VARIATION-NORM AND FLUCTUATION ESTIMATES FOR
ERGODIC BILINEAR AVERAGES 

YEN DO RICHARD OBERLIN EYVINDUR A. PALSSON 


Abstract. For any dynamical system, we show that higher variation-norms for
the sequence of ergodic bilinear averages of two functions satisfy a large range
of bilinear $L^p$ estimates. It follows that, with probability one, the number of
fluctuations along this sequence may grow at most polynomially with respect to
(the growth of) the underlying scale. These results strengthen previous works of
Lacey and Bourgain where almost sure convergence of the sequence was proved
(which is equivalent to the qualitative statement that the number of fluctuations
is finite at each scale). Via transference, the proof reduces to establishing new
bilinear $L^p$ bounds for variation-norms of truncated bilinear operators on $\mathbb{R}$,
and the main new ingredient of the proof of these bounds is a variation-norm
extension of maximal Bessel inequalities of Lacey and Demeter-Tao-Thiele.


1. Introduction 

Let $T$ be an invertible bi-measurable measure-preserving transformation on a
complete probability space $(X, \Sigma, \mu)$. Given two measurable functions $f_1, f_2$ on
$X$, we consider their ergodic bilinear averages, namely

$$M_k[f_1,f_2](x) = \frac{1}{k} \sum_{n=0}^{k-1} f_1(T^n x)\, f_2(T^{-n} x) .$$
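As a concrete illustration of the averages $M_k$, the following minimal Python sketch evaluates them for one particular system. The system is chosen here purely as an example and is not taken from the paper: the rotation $x \mapsto x + \alpha \pmod 1$ of the circle. The names `bilinear_average`, `rotate`, `rotate_back` and `character` are hypothetical helpers introduced for this sketch.

```python
import cmath
import math


def bilinear_average(f1, f2, T, T_inv, x, k):
    """Return M_k[f1, f2](x) = (1/k) * sum_{n=0}^{k-1} f1(T^n x) * f2(T^{-n} x)."""
    total = 0.0
    forward, backward = x, x  # T^n x and T^{-n} x, starting at n = 0
    for _ in range(k):
        total += f1(forward) * f2(backward)
        forward, backward = T(forward), T_inv(backward)
    return total / k


# Assumed example system: rotation of the circle [0, 1) by an irrational angle.
ALPHA = math.sqrt(2) - 1


def rotate(x):
    return (x + ALPHA) % 1.0


def rotate_back(x):
    return (x - ALPHA) % 1.0


def character(x):
    """The function e^{2 pi i x} on the circle."""
    return cmath.exp(2j * math.pi * x)


if __name__ == "__main__":
    for k in (1, 10, 100):
        print(k, bilinear_average(character, character, rotate, rotate_back, 0.3, k))
```

For `f1 = f2 = character` every summand equals $e^{4\pi i x}$, so in this special case the averages do not depend on $k$ at all; less symmetric choices of the two functions give sequences whose convergence is genuinely at stake.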

It was shown by Bourgain in [2] that if $f_1, f_2 \in L^\infty(X)$ then $(M_k[f_1,f_2](x))_{k \ge 1}$
is convergent for $\mu$-almost every $x \in X$. Thanks to a bilinear maximal function
estimate of Lacey [11], Bourgain's result remains valid for $(f_1,f_2) \in L^{p_1}(X) \times L^{p_2}(X)$ for
every $(p_1,p_2,q)$ satisfying

$$\frac{1}{q} = \frac{1}{p_1} + \frac{1}{p_2} , \qquad \frac{2}{3} < q < \infty , \qquad 1 < p_1, p_2 < \infty , \tag{1}$$

and this has been regarded as a bilinear analogue of the classical Birkhoff ergodic
theorem. A similar result also holds for a variant of $M_k$ (namely the ergodic
bilinear Hilbert transform), see Demeter [3] and Demeter-Tao-Thiele [7].

Our aim in this paper is to further demonstrate that the sequence $M_k[f_1,f_2](x)$,
$k \ge 1$, converges rapidly. To formulate a consequence of our estimates, we recall
the notion of fluctuations of a given sequence $(a_1, a_2, \dots)$. Given a scale $\lambda > 0$,
the number of fluctuations in $(a_k)$ with respect to this scale is the largest number
$\ell$ such that there exist $\ell$ disjoint intervals

$$[n_1, m_1),\ [n_2, m_2),\ \dots,\ [n_\ell, m_\ell)$$

with the following property: for every $1 \le j \le \ell$ it holds that $|a_{m_j} - a_{n_j}| \ge 1/\lambda$.
It follows from the Cauchy criterion that $(a_k)$ is convergent if and only if it has a
finite number of fluctuations at every (finite) scale. Thus the results of [2], [11] could
be interpreted as saying that: for almost every $x \in X$, at every scale, the number
of fluctuations along $M_k(f,g)(x)$ is finite. It turns out that this number grows at
most polynomially as $\lambda \to \infty$.
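The definition can be made concrete for a finite real sequence. The sketch below is an illustration added here (the name `count_fluctuations` is hypothetical): scanning from left to right, it closes an interval at the first index $m$ for which some admissible earlier index $n$ satisfies $|a_m - a_n| \ge 1/\lambda$, and then restarts from $m$. Always taking the earliest possible right endpoint is the standard greedy strategy for selecting a maximal number of disjoint intervals.

```python
def count_fluctuations(a, scale):
    """Count fluctuations of the finite real sequence `a` at the given scale.

    Returns the largest number of disjoint half-open index intervals [n_j, m_j)
    with |a[m_j] - a[n_j]| >= 1/scale.
    """
    threshold = 1.0 / scale
    count = 0
    lo = hi = None  # min and max of a[n] over admissible left endpoints n
    for value in a:
        if lo is None:
            lo = hi = value
        elif value - lo >= threshold or hi - value >= threshold:
            count += 1
            lo = hi = value  # the next interval may start at this index
        else:
            lo = min(lo, value)
            hi = max(hi, value)
    return count
```

A convergent sequence has finitely many fluctuations at each fixed scale, but the count may grow as the scale increases; the theorem below quantifies this growth for the bilinear averages.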

Theorem 1.1. Assume that $p_1,p_2,q$ satisfy (1). Then there exists $R < \infty$ such
that for every $f_1 \in L^{p_1}(X)$ and $f_2 \in L^{p_2}(X)$ the following holds: for almost every $x \in X$
the number of fluctuations in the sequence $(M_k[f_1,f_2](x))_{k \ge 1}$ at any scale $\lambda > 0$ is
bounded above by $O(\lambda^R)$, where the implicit constant is uniform over $\lambda$ but could
depend on $x$ and $f_1, f_2$.


For an interesting discussion about applications of fluctuation estimates in ergodic
theory, we refer the readers to Avigad-Rute [1] (cf. Kovač [?]).

Theorem 1.1 is an immediate consequence of Theorem 1.2 below, which provides
a more quantitative estimate. To formulate this result, we recall the notion of
variation-norm. Given $\Omega \subset \mathbb{R}$ and $a : \Omega \to \mathbb{C}$, let its $r$-variation norm be

$$\|a(t)\|_{V^r_t(\Omega)} := \sup_{n,\ N_0 < \dots < N_n} \Big( |a(N_0)|^r + \sum_{j=1}^{n} |a(N_j) - a(N_{j-1})|^r \Big)^{1/r} ;$$

in the sup we require $N_j \in \Omega$ for every $j$. We also use the semi-norm variant
defined similarly without the first term $|a(N_0)|^r$.
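For a finite sequence the supremum in this definition can be computed exactly. The following sketch is an illustration added here (the name `r_variation_norm` and its `seminorm` flag are hypothetical); it uses a quadratic-time dynamic programme over the last index of the chain $N_0 < \dots < N_n$.

```python
def r_variation_norm(a, r, seminorm=False):
    """r-variation norm of the finite sequence `a` (real or complex entries).

    With seminorm=True the leading term |a(N_0)|^r is dropped.
    """
    best = []  # best[j]: largest r-th power sum over index chains ending at j
    for j, value in enumerate(a):
        # Either the chain starts at j, or it extends a chain ending at some i < j.
        start_here = 0.0 if seminorm else abs(value) ** r
        extend = max((best[i] + abs(value - a[i]) ** r for i in range(j)), default=0.0)
        best.append(max(start_here, extend))
    return max(best) ** (1.0 / r) if best else 0.0
```

For example, for the monotone sequence `[0, 1, 2, 3]` and $r = 2$ the optimal chain is a single jump from the first to the last entry, giving the value 3, whereas for $r = 1$ the semi-norm is the usual total variation.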

Theorem 1.2. Assume that $p_1,p_2,q$ satisfy (1). Then there exists $R < \infty$ such
that the following holds for every $r > R$:

$$\|M_k[f_1,f_2](x)\|_{L^q_x(V^r_k)} \lesssim \|f_1\|_{p_1} \|f_2\|_{p_2} .$$

Via a modification of standard transference arguments (which we will detail in
Section 2), Theorem 1.2 follows from estimates for bilinear singular integrals,
Theorem 1.3 below. To formulate the result, we fix some notation.

Given $K : \mathbb{R} \to \mathbb{C}$ sufficiently nice, consider the bilinear operator with kernel $K$

$$B[f_1,f_2](x) = \int_{\mathbb{R}} f_1(x+y)\, f_2(x-y)\, K(y)\, dy , \tag{2}$$

which is a priori well-defined for Schwartz functions $f_1$ and $f_2$. For any $t > 0$ let
$B_t$ be the bilinear operator with kernel $t^{-1}K(t^{-1}y)$.

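We will be interested in $K : \mathbb{R} \to \mathbb{C}$ such that the following properties hold
uniformly over $\xi \ne 0$:

$$|\widehat{K}(\xi)| \lesssim \min\Big( 1, \frac{1}{|\xi|} \Big) , \tag{3}$$

$$\Big| \frac{d^n}{d\xi^n} \widehat{K}(\xi) \Big| \lesssim_n \frac{1}{|\xi|^n} \min\Big( |\xi|, \frac{1}{|\xi|} \Big) , \qquad n \ge 1 . \tag{4}$$

We will in fact work with $K$ where (4) holds for $1 \le n \le n_0$, here $n_0$ is some given
large number; now the implicit constants are allowed to depend on $n_0$. In this
case, we will say that $K$ satisfies (3) and (4) up to order $n_0$.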
Theorem 1.3. Assume that $p_1,p_2,q$ satisfy (1) and $r > 2$. Then there exists $n_0$
finite such that if $K$ satisfies (3) and (4) up to order $n_0$ then

$$\|B_t(f_1,f_2)(x)\|_{L^q_x(V^r_t)} \lesssim \|f_1\|_{L^{p_1}(\mathbb{R})} \|f_2\|_{L^{p_2}(\mathbb{R})} ,$$

where the implicit constant may depend on $n_0$ and on the implicit constants of (3)
and (4) for $1 \le n \le n_0$.

Comparing Theorem 1.3 with Theorem 1.2 it can be seen that there is a discrepancy
between the two ranges $r > 2$ and $r > R$. With the current transference
techniques, it seems that to get the range $r > 2$ for Theorem 1.2 one would need
a version of Theorem 1.3 that accommodates rougher $K$, such as $K(y) = 1_{|y| \le 1}$,
which would be an interesting open problem left for future studies. In fact, in
our transference argument we also prove a weaker version of Theorem 1.3 for this
particular $K$ where instead of $r > 2$ we only have $r > R$ for some finite $R$, see
Theorem 2.1.


Our proof of Theorem 1.3 could be viewed as a variation-norm extension of
Lacey's proof of the boundedness of the bilinear maximal function in [11], although
we will follow more closely the expositions in Demeter-Tao-Thiele [7] and Demeter
[3]. The main new ingredient of the proof (compared to [11], [3], [7]) is a variation-norm
extension of the maximal Bessel inequality for phase plane projections, which in
turn relies on variation-norm estimates for Fourier projection operators associated
with a collection of frequencies. Maximal estimates for these multi-frequency
projection operators were introduced in Bourgain [2], and variation-norm estimates
for smooth multi-frequency Fourier projections were also considered in [17]. In
our context, it turns out that we need variation-norm estimates for sharp
multi-frequency Fourier projections, similar to the original settings considered by
Bourgain. On the other hand, $L^2$ bounds would be sufficient for our purpose, and these
estimates are proved in Theorem 8.1 by adapting an argument in [?].

We mention some closely related works in addition to [2], [11], [3], [7]. A dyadic
version of Theorem 1.3 was considered in our previous work [8] (which in turn is an
adaptation of Thiele [?] to the variation-norm setting). The method of proof in
Demeter [3] relies on a weaker version of Theorem 1.3 where the variation-norms
are replaced by finitary oscillation norms, which were also used by Demeter-Lacey-Tao-Thiele
[6] (see also Demeter [?], Nazarov-Oberlin-Thiele [?]) to improve
the $L^p$ ranges in the Bourgain return time theorem. For a nice introduction
to variation-norm estimates in harmonic analysis, see Jones-Seeger-Wright [10].
The time-frequency analysis framework used in our proof originated from
Lacey-Thiele's proof of the boundedness of the bilinear Hilbert transform [?].

1.1. Outline of the paper. In Section 2 we detail the transference argument that
deduces Theorem 1.2 from Theorem 1.3. In Section 3 we discuss how a short-long
decomposition of the variation-norm leads to a reduction of Theorem 1.3 to two
sub-theorems, which respectively treat the contribution of the long-jumps and the
contribution of the short-jumps. The proof of these theorems will use restricted
weak-type interpolation methods, which we recall in Section 4. In Section 5 we
recall standard terminologies in time-frequency analysis, which will be used in
Section 6 to describe some wave packet representation for the operators underlying
the long-jump and short-jump contributions. Some old and new auxiliary estimates
will be recalled and proved in Section 7, Section 8 and Section 9. In Section 11
we prove a new variation-norm extension of the maximal Bessel inequalities of
Lacey [11] and Demeter-Tao-Thiele [7], which will be used in Section 10 and
Section 12 to prove the desired estimates for the contribution of the long-jumps. In
Section 13 we briefly discuss the needed cosmetic changes that could be applied
(to the treatment of the long-jump contribution) to get the desired estimates for
the short-jump contributions.


1.2. Notational convention. Given an interval $I$, we let $c(I)$ denote the center
of the interval, and for each constant $C > 0$ we let $CI$ denote the dilate of $I$
around its center by the factor $C$. We will use $\mathbf{e}$ and $\mathbf{i}$ to refer to the numbers
$\exp(1)$ and $\sqrt{-1}$ respectively, leaving their non-boldfaced counterparts free for
other purposes.

For every interval I let Xii^) = (1 + 

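For each $s \ge 1$ we let $\mathcal{M}^s$ denote the $L^s$ Hardy-Littlewood maximal operator

$$\mathcal{M}^s f(x) = \sup_{R > 0} \Big( \frac{1}{2R} \int_{x-R}^{x+R} |f(y)|^s \, dy \Big)^{1/s}$$

and abbreviate $\mathcal{M} := \mathcal{M}^1$.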

Throughout the paper we let A denote the Fourier transform 

(5) m ^ A[h(.)](0 := r e ~^^-^^ hix)dx . 

Jr 

Note that with this normalization we have 

h{x) = r . 

Jr 


2. The transference argument 


In this section we deduce Theorem 1.2 from Theorem 1.3 using a variant of
standard transference arguments in [2], [?]. Our first step is to show that the
continuous version of Theorem 1.2 holds, namely

Theorem 2.1. For every $t > 0$ let $S_t$ denote the following operator

$$S_t[f_1,f_2](x) = \frac{1}{t} \int_0^t f_1(x+y)\, f_2(x-y)\, dy .$$

Then for every $(p_1,p_2,q)$ satisfying (1) there exists $R < \infty$ such that for every
$r > R$ it holds that

$$\|S_t[f_1,f_2](x)\|_{L^q_x(V^r_t)} \lesssim \|f_1\|_{p_1} \|f_2\|_{p_2} . \tag{6}$$

Proof of Theorem 2.1. If $(p_1,p_2,q)$ satisfies (1) we let $n_0 = n_0(p_1,p_2)$ be the
constant required in Theorem 1.3.


Fix $r$ below. We divide the proof into two steps.

Step 1: Let $R_0 = 2\big(1 + (n_0+1)\frac{u_0}{u_0-1}\big)$ where $u_0 = \min(p_1,p_2,2q) > 1$. We first
show that for $r > R_0$ it holds that

$$\Big\| \sup_{\lambda > 0} \lambda\, N(S_t, \lambda)^{1/r} \Big\|_{L^q_x} \lesssim \|f_1\|_{p_1} \|f_2\|_{p_2} . \tag{7}$$

Clearly, we may find $1 < u < u_0$ and $r_0 > 2$ such that $r > r_0\big(1 + (n_0+1)\frac{u}{u-1}\big)$.
For brevity, let $n_1 = (n_0+1)\frac{u}{u-1}$.
Now, for each 0 < a < 1/2 let Ka be a function supported in [0,1] such
that ^ Ka{y) ^ 1, we may construct K such that for any 

n ^ 1 . 

It is clear that for any n ^ 0 and fc ^ 0 we have 

Therefore $a^{n_0+1} K_a$ satisfies the assumptions (3) and (4) up to order $n_0$ (we
emphasize that the implicit constants are independent of $a$). Let $B_{a,t}$ denote the
bilinear operator with kernel $t^{-1}K_a(t^{-1}y)$. It follows that for any $r_0 > 2$ we have

$$a^{n_0+1} \|B_{a,t}[f_1,f_2](x)\|_{L^q_x(V^{r_0}_t)} \lesssim \|f_1\|_{p_1} \|f_2\|_{p_2} . \tag{8}$$

Let $S^*$ denote the positive maximal version of $S_t$, namely

$$S^*[f_1,f_2](x) = \sup_{t > 0} S_t[|f_1|, |f_2|](x) .$$

By the bilinear maximal estimate of Lacey, it holds that

$$\|S^*[f_1,f_2]\|_{q} \lesssim \|f_1\|_{p_1} \|f_2\|_{p_2} .$$

Let $u > 1$ be such that $u < \min(p_1,p_2,2q)$, then applying the above estimate for
the triple $(p_1/u, p_2/u, q/u)$ we obtain

$$\big\| S^*[|f_1|^u, |f_2|^u]^{1/u} \big\|_{q} \lesssim \|f_1\|_{p_1} \|f_2\|_{p_2} . \tag{9}$$

Now, for brevity in the following we understand that $S_t = S_t[f_1,f_2](x)$, $B_{a,t} =
B_{a,t}[f_1,f_2](x)$, $S^* = S^*[f_1,f_2](x)$, and $S^{*,u} = S^*[|f_1|^u, |f_2|^u](x)^{1/u}$.

Given any sequence (or function) $(a(t), t \in \Omega)$, let $N(a, \lambda)$ be the number of
fluctuations with respect to scale $1/\lambda$, i.e. the largest $k$ such that there exists a
sequence of $k$ disjoint intervals $[N_1', N_1), \dots, [N_k', N_k)$, where each $N_j, N_j' \in \Omega$ and
furthermore $|a_{N_j} - a_{N_j'}| > \lambda$ for every $1 \le j \le k$.

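For any $t > 0$, using Hölder's inequality we have

$$|S_t - B_{a,t}| \le (2a)^{(u-1)/u}\, S^{*,u} .$$

Letting $\beta := (2a)^{(u-1)/u}$, we have

$$N(S_t, 3\beta S^{*,u}) \le N(B_{a,t}, \beta S^{*,u}) ; \tag{10}$$

here the fluctuation counts are used with respect to the $t$ variable. Using the basic
estimate $\lambda N(a,\lambda)^{1/r_0} \le \|a\|_{V^{r_0}}$ and using (8), for every $r_0 > 2$ we have

$$\big\| \beta^{n_1} \cdot \beta S^{*,u} \cdot N(S_t, 3\beta S^{*,u})^{1/r_0} \big\|_{L^q_x} = (2a)^{1+n_0} \big\| \beta S^{*,u} \cdot N(S_t, 3\beta S^{*,u})^{1/r_0} \big\|_{L^q_x}$$
$$\lesssim a^{1+n_0} \|B_{a,t}[f_1,f_2](x)\|_{L^q_x(V^{r_0}_t)} \lesssim \|f_1\|_{p_1} \|f_2\|_{p_2} .$$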

Using the Holder inequality and (|^, it follows that 

\\/3S*’^ ■ N{St,3/3S*’A^^^\\Ll ^ 


P2 


^ II Q*,U II 

I I r 9 


1 1 

1 + Tl]^ 
therefore

(11) < II/ 1 IUII/ 2 IU 

We note that this estimate holds for any $0 \le \beta \le 1$. Letting $\beta = 2^{-k}/3$, $k \ge 0$,
and using the triangle inequality it follows that

k^O 

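Since $N(S_t, \lambda) = 0$ for $\lambda \ge 2S^*$, and since $S^{*,u} \ge S^*$, it follows that

$$\Big\| (S^{*,u})^{-n_1} \sup_{\lambda > 0} \lambda^{1+n_1} N(S_t,\lambda)^{\frac{1}{r_0(1+\epsilon)}} \Big\|_{L^q_x} \lesssim \|f_1\|_{p_1} \|f_2\|_{p_2} .$$

Using Hölder's inequality and using (9), we obtain

$$\Big\| \sup_{\lambda > 0} \lambda\, N(S_t,\lambda)^{\frac{1}{r_0(1+n_1)(1+\epsilon)}} \Big\|_{L^q_x} \lesssim \|f_1\|_{p_1} \|f_2\|_{p_2} ,$$

therefore by choosing $\epsilon$ small so that $r > (1+\epsilon) r_0 (1+n_1)$ we obtain (7).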

Step 2: We now prove (6); the argument below is similar to an argument in
[8]. We plan to use bilinear Marcinkiewicz interpolation: given each $(p_1,p_2,q)$
satisfying (1) we may let $R$ be the largest $R_0$ of the exponents associated with
any four rectangular weak-type endpoints. Let $r > R$, then we could use (7) at all
of these weak-type endpoints. By monotone convergence it suffices to show that
for any increasing sequence of measurable functions $(N_k)$ it holds that

$$\Big\| \Big( \sum_k |S_{N_k}[f_1,f_2] - S_{N_{k-1}}[f_1,f_2]|^r \Big)^{1/r} \Big\|_{q} \lesssim \|f_1\|_{p_1} \|f_2\|_{p_2} .$$

Let $T[f_1,f_2] = \big( \sum_k |S_{N_k}[f_1,f_2] - S_{N_{k-1}}[f_1,f_2]|^r \big)^{1/r}$. By bilinear interpolation it
suffices to prove the weak-type estimate

$$\lambda\, |\{x : T[f_1,f_2](x) > \lambda\}|^{1/q} \lesssim \|f_1\|_{p_1} \|f_2\|_{p_2}$$

with uniform implicit constants over $\lambda > 0$. By scaling symmetries and dilation
symmetry of $S_t$, we may assume $\lambda = \|f_1\|_{p_1} = \|f_2\|_{p_2} = 1$. Let

$$E = \{x : \sup_k |S_{N_k} - S_{N_{k-1}}| > 1\} .$$

Clearly $|E|^{1/q} \le \|N(S_t, 1)^{1/r}\|_{q} \lesssim 1$. For $x \notin E$, we estimate $T[f_1,f_2](x)$ by considering
level sets for $|S_{N_k} - S_{N_{k-1}}|$ (as a function of $k$) and obtain:

T[fuf2]{xr S (X; 2 -'’’'iV(S,. 2 ^))'"’’ 

j>0 

Therefore by the Chebysheff inequality we obtain 

in:r[/i,/JW>l)|''» < \E\''’ + \{xtE:Tlf,j 2 ](x)>l}\^'‘' 

^ 1 + ( 
Using ([^ for r = where 5 > 0 is sufficiently small so that r > R, we have

J N{S„ 2-^y/^dx = ||iV(^*, 

Therefore 


J >0 


by choosing $\epsilon > 0$ sufficiently small depending on $\delta$ (which in turn depends on $r$
and $R$). This completes the proof of (6). □

We now transfer Theorem 2.1 to the integers. Fix $r > R$. We'll show that for
any two sequences $f_1(n)$ and $f_2(n)$ indexed by $\mathbb{Z}$ it holds that

$$\|M_k[f_1,f_2](n)\|_{\ell^q_n(V^r_k)} \lesssim \|f_1\|_{\ell^{p_1}(\mathbb{Z})} \|f_2\|_{\ell^{p_2}(\mathbb{Z})} ,$$

where

$$M_k[f_1,f_2](n) := \frac{1}{k} \sum_{m=0}^{k-1} f_1(n+m)\, f_2(n-m) .$$

To see this, let $S_t$ be the bilinear operator defined in (2) with kernel $t^{-1} 1_{0 < y < t}$
where $t > 0$. We extend $f_1$ and $f_2$ from $\mathbb{Z}$ to $\mathbb{R}$ by letting:

(i) $F_1(x) = f_1(n)$ if there exists $n \in \mathbb{Z}$ such that $|x - (n + 1/2)| < 1/3$, and
$F_1(x) = 0$ otherwise;

(ii) $F_2(x) = f_2(n)$ if there exists $n \in \mathbb{Z}$ such that $|x - (n - 1/2)| < 1/3$, and
$F_2(x) = 0$ otherwise.

Let 77, e Z and x e [n — 1, n + 1]. Then for any m e Z it holds that 

j Fi{x + y)F 2 {x - y)dy = + m)f 2 {n - m) 

Thus for any fc ^ 0 we have 

Sk{Fi,F 2 ){x) = Fi{x+ y)F 2 {x-y)dy =‘^Mk[fi,f 2 ]{n) , 

^ Jo<y<k ^ 

and consequently 

Il^fc[/i,/ 2 ](r)||v; inf ||*S'fc(Fi, F 2 )(x)||vy . 

3;e[n-g,n+g] 


It follows that 


\Mk[fl, f2]{n)\\Ll{V^) ^ ll>S't(F’i, F 2 )(x)||i|(y^r) , 


and using Theorem |2.1| we can bound the right hand side by 

~ ||-f^l||LPi(]R)||-U2||l,P2(R) = C/|/l||£Pl(Z) ||/2||£P2(Z) 


Our next step is to transfer the result on $\mathbb{Z}$ to a more general setting. Let $T$ be
a measure-preserving transformation on a complete probability space $(X, \Sigma, \mu)$.
Let $f$ and $g$ be given.

Fix a large integer $N$, which we will send to $\infty$ later. All implicit constants
below are independent of $N$ and $x$.

For fixed $x$, let $M(f,g,N)(x)$ be the $r$-variation norm of the finite sequence
indexed by $1 \le k \le N$:

$$\frac{1}{k} \sum_{n=0}^{k-1} f(T^n x)\, g(T^{-n} x) .$$

Note that for every $|n| \le N$ the value of $M(f,g,N)(T^n x)$ depends only on
$f(T^m x)$ and $g(T^m x)$ with $|m| \le 2N$. Thus, using the $\mathbb{Z}$-result, it follows that

$$\sum_{|n| \le N} |M(f,g,N)(T^n x)|^q \lesssim \Big( \sum_{|m| \le 2N} |f(T^m x)|^{p_1} \Big)^{q/p_1} \Big( \sum_{|m| \le 2N} |g(T^m x)|^{p_2} \Big)^{q/p_2} .$$

Integrating over $x \in X$ and using the Hölder inequality, we obtain

$$\sum_{|n| \le N} \int |M(f,g,N)(T^n x)|^q \, d\mu(x) \lesssim \Big( \sum_{|m| \le 2N} \int |f(T^m x)|^{p_1} d\mu(x) \Big)^{q/p_1} \Big( \sum_{|m| \le 2N} \int |g(T^m x)|^{p_2} d\mu(x) \Big)^{q/p_2} .$$

Using the fact that $T$ is bi-measure preserving on $(X, \mu)$ we obtain

$$\|M(f,g,N)\|_{L^q(X,\mu)} \lesssim \|f\|_{L^{p_1}(X,\mu)} \|g\|_{L^{p_2}(X,\mu)} ,$$

and by sending $N \to \infty$ we obtain the conclusion of Theorem 1.2. This completes
the transference argument, and the rest of the paper is devoted to the proof of
Theorem 1.3. We'll assume that $K$ satisfies (3) and (4) up to some large order
that may depend on $p_1,p_2,q,r$. We will also free the symbol $S_t$ which could be
used in the future for different purposes.


3. Separation of short and long jumps 


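For any function $a(t)$ on $\mathbb{R}_+$ it is not hard to see that

$$\|a(t)\|_{V^r_t} \lesssim \|a(t)\|_{S^r_t} + \|a(2^n)\|_{V^r_n(\mathbb{Z})} , \qquad \|a(t)\|_{S^r_t} := \Big( \sum_{n \in \mathbb{Z}} \|a(t)\|^r_{V^r_t([2^n, 2^{n+1}])} \Big)^{1/r} ,$$

where the variation over each dyadic interval $[2^n, 2^{n+1}]$ is taken in the semi-norm sense.
Applying this estimate to $a(t) = B_t[f_1,f_2](x)$, the proof of Theorem 1.3 is divided
into two parts: the first part handles the long jumps (i.e. $\|a(2^n)\|_{V^r_n(\mathbb{Z})}$) and the
second part handles the short jumps (i.e. $\|a(t)\|_{S^r_t}$).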


Theorem 3.1. For any $r > 2$ and $p_1,p_2,q$ satisfying (1) it holds that

$$\|B_{2^k}(f_1,f_2)(x)\|_{L^q_x(V^r_k(\mathbb{Z}))} \lesssim_{p_1,p_2,r} \|f_1\|_{p_1} \|f_2\|_{p_2} .$$

Theorem 3.2. Assume that $p_1,p_2,q$ satisfy (1) and $r > 2$. Assume that $K_s$, $1 \le
s \le 2$, is a family of kernels such that $K_s$ satisfies (3), (4) up to a high order
$n_0 = n_0(p_1,p_2,q,r)$, and furthermore

( 12 ) , 


and the implicit constants are uniform over 1 < s < 2. Then it holds that 
Theorem 3.1 immediately takes care of the long-jump component of the variation
norm $\|B_t[f_1,f_2]\|_{V^r_t}$. Below we deduce the desired estimate for the short jump
component from Theorem 3.2.


We hrst note that if a{t) is differentiable then using ||.||p 2 ^ ll-Hpi we obtain 

ll®(^) Ilvj2|-2n^2’*+i] ~ II® (^) llTh2".2"+b ^ ^ II® ®)lli'l[l>2] • 

Therefore using Minkowski’s inequality we have 


\\Ll[l,2] 


nGX 


< 


1(2 


neZ 


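We plan to apply the estimate to $a(t) = B_t[f_1,f_2](x)$ where $x$ is fixed. Let
$h(y) = -(K(y) + yK'(y))$, or equivalently $\widehat{h}(\xi) = \xi \frac{d}{d\xi}\widehat{K}(\xi)$. Let $H_t$ be the bilinear
singular integral with kernel $t^{-1}h(t^{-1}y)$. Then

$$t\, \frac{d}{dt} B_t[f_1,f_2](x) = H_t[f_1,f_2](x) ,$$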

therefore 


neZ 

We may write $H_{2^n s}[f_1,f_2](x) = \int f_1(x+y)\, f_2(x-y)\, 2^{-n}K_s(2^{-n}y)\, dy$ with $K_s(y) :=
s^{-1}h(s^{-1}y)$, and it is not hard to see that $K_s$ satisfies (3), (4), (12) uniformly in
$s \in [1,2]$. Thus, the desired estimates for the short jump component of $B_t[f_1,f_2]$
follow from Theorem 3.2.
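The kernel identity behind the relation between $B_t$ and $H_t$, namely $t\,\partial_t\big[t^{-1}K(y/t)\big] = t^{-1}h(y/t)$ with $h(y) = -(K(y) + yK'(y))$, can be checked numerically. The sketch below is only a sanity check added here; it assumes a Gaussian test kernel $K(y) = e^{-y^2}$, chosen for convenience rather than taken from the paper, and all function names are hypothetical.

```python
import math


def K(y):
    """Assumed test kernel (a Gaussian), used only for this numerical check."""
    return math.exp(-y * y)


def K_prime(y):
    return -2.0 * y * math.exp(-y * y)


def h(y):
    """h(y) = -(K(y) + y K'(y))."""
    return -(K(y) + y * K_prime(y))


def dilate(kernel, t, y):
    """The dilated kernel t^{-1} kernel(y / t)."""
    return kernel(y / t) / t


def t_derivative_of_dilate(t, y, eps=1e-6):
    """Central finite difference approximating d/dt of t^{-1} K(y / t)."""
    return (dilate(K, t + eps, y) - dilate(K, t - eps, y)) / (2.0 * eps)


if __name__ == "__main__":
    for t in (0.5, 1.0, 3.0):
        for y in (-2.0, 0.3, 1.7):
            print(t, y, t * t_derivative_of_dilate(t, y), dilate(h, t, y))
```

The two printed columns agree up to the finite-difference error, which is the pointwise statement that integrates against $f_1(x+y) f_2(x-y)$ to give the operator identity.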

In the rest of the paper we prove Theorem 3.1 and Theorem 3.2. We will use
the restricted weak-type interpolation approach of [15], which will be discussed in
the next section.


4. Linearization and interpolation 

4.1. Linearization. For each $x$ consider a measurable function $L : \mathbb{R} \to \mathbb{Z}_+$,
the set of positive integers, and two sequences of measurable functions: a
non-decreasing integer valued sequence $(k_n(x))_{n=0}^{L(x)}$ and a sequence $(a_n(x))_{n=1}^{L(x)}$ such
that $\sum_n |a_n(x)|^{r'} \le 1$. Then an appropriate choice of $L$ and such sequences
guarantees that

$$\|B_{2^k}(f_1,f_2)(x)\|_{V^r_k(\mathbb{Z})} \le 2 \sum_{n=1}^{L(x)} \big( B_{2^{k_n}}[f_1,f_2](x) - B_{2^{k_{n-1}}}[f_1,f_2](x) \big)\, a_n(x) .$$

Similarly, for each $s \in [1,2]$ we may find a sequence of measurable functions
$(d_n(s,x))_{n=-\infty}^{\infty}$ such that $\sum_n |d_n(s,x)|^{r'} \le 1$ and which linearizes the short-jump
sum over $n \in \mathbb{Z}$ in the same way.
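The mechanism behind this linearization is the duality between $\ell^r$ and $\ell^{r'}$, $r' = r/(r-1)$: an $\ell^r$ norm of a sequence of differences can be written as a pairing with coefficients of unit $\ell^{r'}$ norm. The following sketch illustrates this for a finite real sequence only; the name `dual_coefficients` is hypothetical, and the measurable selection in $x$ used in the text is not modelled.

```python
import math


def dual_coefficients(d, r):
    """Return coefficients a with sum |a_n|^{r'} = 1 and sum d_n a_n = ||d||_{l^r}.

    Here r > 1, r' = r / (r - 1), and d is a finite list of real numbers.
    """
    norm = sum(abs(x) ** r for x in d) ** (1.0 / r)
    if norm == 0.0:
        return [0.0] * len(d)
    # a_n = sign(d_n) |d_n|^{r-1} / ||d||^{r-1}
    return [math.copysign(abs(x) ** (r - 1), x) / norm ** (r - 1) for x in d]


if __name__ == "__main__":
    d = [1.0, -2.0, 0.5]
    a = dual_coefficients(d, 3.0)
    print(sum(x * y for x, y in zip(d, a)))   # equals the l^3 norm of d
    print(sum(abs(y) ** 1.5 for y in a))      # equals 1 (r' = 3/2)
```

Applying such a choice to the differences along an extremal (or nearly extremal) chain of indices replaces the nonlinear variation norm by a linear expression in the operators, at the cost of coefficients depending on $x$.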
The desired estimates in Theorem 3.1 and Theorem 3.2 follow from certain
restricted weak-type estimates for the following tri-linear forms, which we will
discuss in the next section.

$$\Lambda_{long}(f_1,f_2,f_3) := \Big\langle \sum_{n=1}^{L} \big( B_{2^{k_n}}[f_1,f_2] - B_{2^{k_{n-1}}}[f_1,f_2] \big)\, a_n ,\ f_3 \Big\rangle ,$$

$$\Lambda_{short}(f_1,f_2,f_3) := \int_1^2 \Lambda_{short,s}(f_1,f_2,f_3)\, ds ,$$

$$\Lambda_{short,s}(f_1,f_2,f_3) := \Big\langle \sum_{n \in \mathbb{Z}} \int f_1(\cdot+y)\, f_2(\cdot-y)\, 2^{-n}K_s(2^{-n}y)\, dy\ d_n(s,\cdot) ,\ f_3 \Big\rangle .$$


4.2. Restricted weak-type interpolation. For any $G \subset \mathbb{R}$ with finite Lebesgue
measure, we say that $H \subset G$ is a minor subset if $|H| \le |G|/2$.

Let $\alpha = (\alpha_1,\alpha_2,\alpha_3) \in \mathbb{R}^3$ be such that $\alpha_1 + \alpha_2 + \alpha_3 = 1$ and at most one
$\alpha_j$ could be negative. We say that a tri-linear functional $\Lambda(f_1,f_2,f_3)$ satisfies
restricted weak-type estimates with exponents $\alpha$ if the following holds.

Case 1: $\min(\alpha_1,\alpha_2,\alpha_3) \ge 0$. Then we require existence of $j_0 \in \{1,2,3\}$ with
the following property: for every triple $(F_1,F_2,F_3)$ of Lebesgue measurable
subsets of $\mathbb{R}$ of finite measure we could find a minor subset $B \subset F_{j_0}$ such that

$$|\Lambda(f_1,f_2,f_3)| \lesssim |F_1|^{\alpha_1} |F_2|^{\alpha_2} |F_3|^{\alpha_3} \tag{13}$$

for any $f_1,f_2,f_3$ with the following property: $|f_j| \le 1_{F_j}$ if $j \ne j_0$, $|f_{j_0}| \le 1_{F_{j_0} \setminus B}$.

Case 2: $\min(\alpha_1,\alpha_2,\alpha_3) < 0$. Let $k$ be such that $\alpha_k < 0$. By assumptions on $\alpha$
the other $\alpha_j$'s are nonnegative. Then we require the above property with $j_0 = k$.
Let $A$ be the hexagon on the plane $L = \{\alpha_1 + \alpha_2 + \alpha_3 = 1\}$ whose six vertices
$A_1, \dots, A_6$ are the permutations of

$$\Big( -\frac{1}{2},\ \frac{1}{2},\ 1 \Big) . \tag{14}$$

By the interpolation argument of [15], to show Theorem 3.1 and Theorem 3.2 it
suffices to prove that in any given neighborhood (in the plane $L$) of any vertex of
$A$ we could find $\alpha$ such that $\Lambda_{long}(f_1,f_2,f_3)$ and $\Lambda_{short}(f_1,f_2,f_3)$ satisfy restricted
weak-type estimates with exponents $\alpha$. (Note that when $\alpha$ is near a vertex of $A$
it is automatic that at most one coordinate of $\alpha$ could be negative.)

It will be clear from our proof (of the restricted weak-type estimates for all
involved trilinear forms) that the index $j_0$ and the exceptional set $B$ depend only on
$\alpha$ and $F_1, F_2, F_3$. Also, in the proof the choice of $\alpha$ (inside any small neighborhood
of any given vertex of $A$) will not depend on the underlying trilinear form.

Therefore, a posteriori, to show the restricted weak-type estimates for $\Lambda_{short}$ it
suffices to obtain the same estimate for $\Lambda_{short,s}$ (with the same set of exponents),
provided that the implicit constants are uniform over $s \in [1,2]$. This uniformity
in turn is a consequence of the fact that the implicit constants in the assumptions
for $K_s$ are uniform over $s \in [1,2]$.

Similarly, in the proof for $\Lambda_{long}$ we'll decompose it into a weighted sum of simpler
trilinear forms, and it suffices to obtain the restricted weak-type estimates for
each of the new forms (with the same set of exponents) provided that the implicit
constants are uniform.


5. Terminology of tiles and trees 


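In this section we recall some terminologies from [?] that will be used in
the proof. By a grid we mean a collection of intervals whose lengths are integral
powers of 2 such that if $I, I'$ are two intersecting elements then $I \subset I'$ or $I' \subset I$.
In addition to the standard grid $\mathcal{G}_0$ of dyadic intervals $2^i[m, m+1)$, we will use
the grids

$$\mathcal{G}_{j,\iota} = \Big\{ \Big[ 2^i\Big(m + \frac{j}{5}\Big),\ 2^i\Big(m + \frac{j}{5} + 1\Big) \Big) : i \equiv \iota \ (\mathrm{mod}\ 4),\ m \in \mathbb{Z} \Big\}$$

where $j$ and $\iota$ are integers; clearly $\mathcal{G}_{j,\iota}$ depends only on $j \ (\mathrm{mod}\ 5)$ and $\iota \ (\mathrm{mod}\ 4)$.
We will also make use of the grids

$$\mathcal{G}_1 = \Big\{ \Big[ 2^i\Big(m - \frac{(-1)^i}{3}\Big),\ 2^i\Big(m - \frac{(-1)^i}{3} + 1\Big) \Big) : i, m \in \mathbb{Z} \Big\} ,$$

$$\mathcal{G}_2 = \Big\{ \Big[ 2^i\Big(m + \frac{(-1)^i}{3}\Big),\ 2^i\Big(m + \frac{(-1)^i}{3} + 1\Big) \Big) : i, m \in \mathbb{Z} \Big\} .$$

It is clear that for every (not necessarily dyadic) interval $I$ there is a $d \in \{0,1,2\}$
and a $J \in \mathcal{G}_d$ such that $I \subset J$ and $|J| \le 3|I|$; we then say that $I$ is $d$-regular.

A tile $p$ is a rectangle $I_p \times \omega_p \subset \mathbb{R}^2$ of area 1 such that $I_p$ is dyadic. A tri-tile $P$
will consist of a quadruplet of intervals $(I_P, \omega_{P_1}, \omega_{P_2}, \omega_{P_3})$ where $I_P$ is dyadic and
$|I_P||\omega_{P_i}| = 1$ for each $i$. Associated to the tri-tile $P$ are the three tiles $P_i = I_P \times \omega_{P_i}$,
which justify the notation that is implicit in the previous sentence.

For each quadruplet of integers $\nu = (j_1, j_2, e, \iota)$ such that $0 \le j_1, j_2 \le 4$ and
$498 \le |e| \le 4002$ and $0 \le \iota < 4000$, consider the collection of tri-tiles

$$\mathbb{P}_\nu = \Big\{ \Big( [2^{-i'}m,\ 2^{-i'}(m+1)),$$
$$[2^{i'}(n + \tfrac{j_1}{5}),\ 2^{i'}(n + \tfrac{j_1}{5} + 1)),$$
$$[2^{i'}(n + e + \tfrac{j_2}{5}),\ 2^{i'}(n + e + \tfrac{j_2}{5} + 1)), \tag{15}$$
$$[2^{i'}(2n + e + \tfrac{j_1+j_2}{5} + 1),\ 2^{i'}(2n + e + \tfrac{j_1+j_2}{5} + 2)) \Big)$$
$$: m, n, i' \in \mathbb{Z},\ i' \equiv \iota \ (\mathrm{mod}\ 4000) \Big\} .$$

Above, we clearly have $\omega_{P_1} \in \mathcal{G}_{j_1,\iota}$, $\omega_{P_2} \in \mathcal{G}_{j_2,\iota}$, and $\omega_{P_3} \in \mathcal{G}_{j_1+j_2,\iota}$.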

Fixing $\nu$ for the remainder of the section (some definitions below depend on $\nu$),
we now recall, from [3] (cf. [?]), some notions of order for tiles.


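Definition 5.1. For two tiles $p, p'$ we write

• $p' < p$ if $I_{p'} \subset I_p$ and $3\omega_p \subset 3\omega_{p'}$,

• $p' \le p$ if $p' < p$ or $p' = p$,

• $p' \lesssim p$ if $I_{p'} \subset I_p$ and $\omega_p \subset 10|e|\omega_{p'}$,

• $p' \lesssim' p$ if $p' \lesssim p$ and $10\omega_{p'} \cap 10\omega_p = \emptyset$.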
It is not hard to see that if $P, P' \in \mathbb{P}_\nu$ are two tri-tiles with $P'_i \le P_i$ for some
$i \in \{1,2,3\}$ then $P'_j \lesssim P_j$ for each $j \in \{1,2,3\}$.

The ordering above gives rise to the concept of a tree, which we recall below:

Definition 5.2. Let $i \in \{1,2,3\}$. An $i$-overlapping tree is a collection of tri-tiles
$T \subset \mathbb{P}_\nu$ together with a top tri-tile $P_T \in \mathbb{P}_\nu$ which satisfies

$$P_i \le (P_T)_i \quad \text{for all } P \in T \setminus \{P_T\} .$$

We say that $T$ is a tree if it is an $i$-overlapping tree for some $i \in \{1,2,3\}$. We say
that $T$ is a tree with top if $P_T \in T$.

A tree $T$ is called $j$-lacunary if

$$P_j \lesssim' (P_T)_j \quad \text{for all } P \in T \setminus \{P_T\} .$$

It follows that a tree is $j$-lacunary if and only if it is $i$-overlapping for some
$i \in \{1,2,3\} \setminus \{j\}$; furthermore for each $P \in T$ we have $\mathrm{sgn}\big(c(\omega_{P_j}) - c(\omega_{(P_T)_j})\big) = \epsilon_{ij}$,
where we define $\epsilon_{ij} = \mathrm{sgn}(e)$ for $(i,j) \in \{(1,2), (1,3), (3,2)\}$ and $\epsilon_{ij} = -\mathrm{sgn}(e)$
for $(i,j) \in \{(2,1), (3,1), (2,3)\}$. We will abbreviate $I_T := I_{P_T}$.

Definition 5.3. We will say that a collection of trees $\mathbf{T}$ is strongly $j$-disjoint for
some $j \in \{1,2,3\}$ if

(1) Each $T \in \mathbf{T}$ is $j$-lacunary

(2) If $T, T' \in \mathbf{T}$ and $T \ne T'$ then $T \cap T' = \emptyset$

(3) IfT,T' eT,T^T',PeT,P'e T' , and ojp. c cjpj then Ip/ n Ip = 0 

(4) IfT,r eT,T^ r, and P' e V then P' ^ [p^j ' 

Note that, due to our choice of order on tiles, the condition (Q above is some¬ 
what nonstandard (in comparison with, say, 0). Also note that conditions (|^ 
and (|^ imply that if T, T' e T, T 7 ^ T', P e T, and P' e T' then P, n Pj = 0. 

6. Discretization 

In this section, we discuss discretization, i.e. wavelet representation, for $\Lambda_{long}$
and $\Lambda_{short,s}$. We'll largely follow [3]. We'll discuss in detail the process for $\Lambda_{long}$;
the discretization for $\Lambda_{short,s}$ will be similar and is discussed at the end of the section.

6.1. Cancellation between dilates. The point of conditions (3), (4) is that
they allow one to decompose $K$ into much simpler kernels:

Lemma 6.1. If $K$ satisfies (3) and (4) then

00 

(16) K{0 = , 

jeZ 

where {c, },- e £^(Z) and each Kj could be furthermore written as the sum of dilates 
of a single generating function Kj{j = where supp((j)j) c {500 < 

|.^| < 4000}, and it holds uniformly in j that 

(17) |0(i)| < C„,„(l + III)-” 

for every m,n 0 . If K satisfies (|^ and (|^ up to some high order then (0 
holds for m,n ^ M with M comparably large. 
The utility of this approach lies in the following cancellation between dilates of
$K_j$: for all integers $k_1 \le k_2$ we have

$$\widehat{K_j}(2^{k_1}\xi) - \widehat{K_j}(2^{k_2}\xi) = \sum_{k_1 \le k < k_2} \widehat{\phi_j}(2^k \xi) ,$$

which turns out to be convenient for reducing $\Lambda_{long}$ to wavelet operators. Namely,
by pulling out the sum in $j$, we thus see that the consideration of $\Lambda_{long}$ reduces to
considerations of $\Lambda_j$, defined by:

$$\Lambda_j(f_1,f_2,f_3) := \Big\langle \sum_{n=1}^{L} \ \sum_{k_{n-1} \le \ell < k_n} B_{j,\ell}[f_1,f_2]\, a_n ,\ f_3 \Big\rangle ,$$

$$B_{j,\ell}[f_1,f_2](x) := \int f_1(x+y)\, f_2(x-y)\, 2^{-\ell}\phi_j(2^{-\ell}y)\, dy ,$$

and $B_{j,\ell}$ could be decomposed into a finite number of discrete wavelet operators
at scale $\ell$; this decomposition will be discussed in Section 6.2.


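Proof of Lemma 6.1. Observe that the given assumptions on $K$ imply the
existence of $\widehat{K}(0+)$ and $\widehat{K}(0-)$. We consider two cases. We'll only consider the
setting when (3) and (4) hold for all orders; the finite order case could be achieved
by the same argument.

Case 1: Suppose that $\widehat{K}(0+) = \widehat{K}(0-) = 0$, then using the given assumptions
on $K$ it follows that for every $n \ge 0$ it holds that

$$\Big| \frac{d^n}{d\xi^n} \widehat{K}(\xi) \Big| \lesssim_n \frac{1}{|\xi|^n} \min\Big( |\xi|, \frac{1}{|\xi|} \Big)$$

(the improvement is at $n = 0$). Let $\eta$ be a nonnegative bump function on
$\{1000 \le |\xi| \le 2000\}$ such that $\sum_j \eta(2^j \xi) = 1$ for every $\xi \ne 0$. Let

$$\widehat{K_j}(\xi) = 2^{|j|}\, \widehat{K}(2^{-j}\xi)\, \eta(\xi) ,$$

which is supported in $\{1000 \le |\xi| \le 2000\}$; it is routine to check that (16) holds
with $c_j = 2^{-|j|}$ and $\big| \frac{d^n}{d\xi^n} \widehat{K_j}(\xi) \big| \lesssim_n 1$ for all $n \ge 0$. Let $\widehat{\phi_j}(\xi) = \widehat{K_j}(\xi) - \widehat{K_j}(2\xi)$, then
$\phi_j$ has the desired properties.

Case 2: $(\widehat{K}(0+), \widehat{K}(0-)) \ne (0,0)$. Let $\psi$ be such that $\widehat{\psi}$ is supported on
$[-2000, 2000]$ and is in $C^\infty(\mathbb{R} \setminus \{0\})$, and $\widehat{\psi}(\xi) = \widehat{K}(0+)$ for $\xi \in [0, 1000]$ and
$\widehat{\psi}(\xi) = \widehat{K}(0-)$ for $\xi \in [-1000, 0)$. Then by writing $K = \psi + (K - \psi)$ and applying
the analysis in Case 1 for $K - \psi$, we are left with $\psi$, which we will decompose
directly into the sum of dilates of a single generating function. Namely, let
$\widehat{\phi}(\xi) = \widehat{\psi}(\xi) - \widehat{\psi}(2\xi)$; it is clear that $\phi$ satisfies the desired properties. □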


6.2. Wave packet representation. Below we will decompose $B_{j,\ell}[f_1,f_2]$ into
wavelet sums. For convenience of notation, we will suppress the variable $j$, namely
below $\phi = \phi_j$ whose Fourier transform is supported on $\{500 \le |\xi| \le 2000\}$ and $\phi$
satisfies (17) up to sufficiently high order.

Definition 6.2. We say that 0 is an wave packet of order M adapted to a 
tile p = I X CJ if is supported in cj and the following estimate holds for all 
0 < m, n < M: 

d^ / \ 1 

(18) ^ CM,N,n,m-^Uxr 
Lemma 6.3. Given any $M, N > 0$, if $\phi$ satisfies (17) up to sufficiently high order
and $\widehat{\phi}$ is supported in $\{500 \le |\xi| \le 4000\}$ then for each $\ell \in \mathbb{Z}$, $B_{j,\ell}[f_1,f_2](x)$ can
be written as the sum over $\nu = (j_1, j_2, e)$, $0 \le j_1, j_2 \le 4$ and $498 \le |e| \le 4002$
integers, of


(19) 


y 2-^iJi 




(/i, V'j.pi) (A. i’i.pj) i’l.p.iiO 


PeF,,:\uj\=2-^ 


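where (uniformly over tri-tiles $P \in \mathbb{P}_\nu$, $i = 1,2,3$, and $j \in \mathbb{Z}$) $\psi_{j,P,i}$ is a wave
packet adapted to $P_i$ of order $M$ (the constants in (18) may depend on $M, N, \nu$).


Proof. We consider $M = \infty$ below, the finite case is similar. Recall that the Fourier
transform $\mathcal{F}$ is defined by (5). We first make several remarks about $B_{j,\ell}$. Suppose
that the supports of $\widehat{\psi_1}$ and $\widehat{\psi_2}$ are contained in intervals $\omega_1, \omega_2$ respectively. Then
the identity

$$\mathcal{F}\big[ B_{j,\ell}[\psi_1,\psi_2] \big](\xi) = \int_{\mathbb{R}} \widehat{\psi_1}(\xi - \eta)\, \widehat{\psi_2}(\eta)\, \widehat{\phi}\big( 2^\ell(2\eta - \xi) \big)\, d\eta$$

gives rise to two observations. First, if $B_{j,\ell}(\psi_1,\psi_2)$ does not vanish then

$$(\omega_1 - \omega_2) \cap \{500 \cdot 2^{-\ell} \le |\xi| \le 4000 \cdot 2^{-\ell}\} \ne \emptyset . \tag{20}$$

Second, the support of the Fourier transform of $B_{j,\ell}(\psi_1,\psi_2)$ is contained in

$$\omega_1 + \omega_2 . \tag{21}$$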


Now, tnrning to spatial localization, we have 

Since (f) satisfies lEIl. it follows that if for some f, Xi, X 2 we have 
d^ 




dx'^ 


for each i = 1,2, and for every n,m ^ 0, then 

(23) 


dx^‘ 




< 


< 




\X1 - X2\._ 


)-”^(l + 


\x - {xi + a; 2 )/ 2 L ^ 
2i > 


for each n,m^ 0. (Here we emphasize that Cm,fs are independent of f. 

jjeZ \ r\ 5 


Fix a Schwartz fnnction fi snpported on [0, 2/5) snch that \ fj(- — = 1. 


For each pair of intervals {I,uj) with |a;||/| = 1 let 

fi,u2 = _ c(a;))) 


which is snpported inside the right half of u. Using a Fonrier sampling theorem, 
for any Schwartz fnnction / it holds that 

4 

j=0 ojeQo leQo 

|tj|=2-'^ |/|=2^ 











Let Ui = cj + y |a;| and U 2 = ui + {^ + e)\u\. By (20), it follows that i?y£[/i, /2 ](t) 
can be written as the snm over triplets of integers (ji,j2,e), with 0 < Ji,j2 ^ 4 
and 498 < |e| < 4002, of 

(24) DLL 

Ii^Go I 2 GG 0 
|cj|=2-^ \h\=2^ \l2h2^ 




By (21), ^/i,72,w,iij2,e is snpported on 003 := [0, |a;|) + 2c(a;) + + ^)A\ 

by (23), satisfies 


(25) 


<£_ n-2.l(2o(u.)+(ii±a+e)|i>|)j., 


dx"^' 


® ^ V 2 ^ 5 21 12 J2,el(^2 


< 




2^ 

|x-c(/i)| , 

2^ ^ 


2^ 

for each n, m ^ 0. (Note that ji, ^2, e are bonnded.) 

We now fix Ji and fnrther divide the right hand side of (24) according to j : = 
2“^(c(/i) — 0(12)). For j = 0 i.e. for terms in the snm (24) where Ji = I 2 ='■ I 


we may dehne the tri-tile P = (/, cji, a;2,1V3) and the corresponding wave packets 
natnrally 

'0O,P,l = ) V'0,P,2 = '0/2,2.22 5 00,P,3 = |-^r'^^0/i,/2,‘.2,ii,j2,e ■ 

The remaining terms can be dealt with by nsing the rapid decay in |c(/i) — c(J2) |, 


(25): we still define 0yp,i = 0/i,cji, however to shift the localization of 0/2,w2 to Ji 
we define 

Ap ,2 (1 + _ 

i.,„ 2''2(i + 


for some large L. (The rapid decay in (25) takes care of the extra factors in 0j,p,3.) 

Finally, we split (24) up one more time so that whenever $|I| > |I'|$ we have
$|I| = 2^{4000k}|I'|$ for some positive integer $k$. Note that while this splitting gives rise
to the sparseness required by $\mathbf{P}_\nu$, it also means that we need to relabel and rescale
the wave packets slightly to maintain the sequence of weights. □

It follows that to prove restricted weak-type estimates for $\Lambda_{\mathrm{long}}$, we are left with
showing the following theorem. In the theorem, $\nu = (j_1, j_2, e, \iota)$ is any quadruplet
of integers such that $0 \le j_1, j_2 \le 4$, $498 \le |e| \le 4002$, $0 \le \iota < 4000$, and $\mathbf{P}_\nu$ is
defined by (15).


Theorem 6.4. Let r > 2 and pi,p 2 ,q satisfy ([^. 

Suppose that (uniformly over tri-tiles $P \in \mathbf{P}_\nu$, $i = 1, 2, 3$) $\psi_{P,i}$ is an $L^2$-normalized
wave packet adapted to $P_i$ up to order $M$ sufficiently large (the required $M$
may depend on $p_1, p_2, q$).
Then the trilinear form

Xj Xj (/d'0P,i) (/2,'0P,2)'0P,3 Onj/s 

^n=lpgp^: 2''"-i!£|7p|<2fcn 

satisfies restricted weak-type estimates with exponents $\alpha$ arbitrarily close to any
given vertex of $A$ defined by (14).


Recall that $\sum_n |a_n(x)|^{r'} \le 1$. For convenience, let $a_P(x) := a_m(x)$ if $m =
m(P,x)$ is the unique integer in $\{1, \dots, L(x)\}$ satisfying $2^{k_{m-1}} \le |I_P| < 2^{k_m}$,
and $a_P(x) = 0$ if such $m$ does not exist. We also let $\phi_{P,3}(x) = a_P(x)\psi_{P,3}(x)$ and
$\phi_{P,i} = \psi_{P,i}$ for $i = 1,2$. Let $\mathbf{P}$ be a finite subset of $\mathbf{P}_\nu$. It suffices to demonstrate
that the trilinear form


(26) 


Ap(/l, /2, /s) — ^ ^ I-^pI {fl, 0P,l) (/2, fipp) 4>P,3, /s^ 


is of restricted weak type with exponents $\alpha$, with $\mathbf{P}$-uniform implicit constants.

6.3. Discretization for $\Lambda_{\mathrm{short},s}$. Recall that

AshortAfiJ2j3) = (Y,\fii-+y)f2i--y)2-^Kfi2-^y)dydAs,-),f3) 


Using Ks{f) < |,^| we could proceed as in Case 1 of the proof of Lemma 
obtain a decomposition 


6.1 


and 


K .(0 - 


where Kgj is supported in {1000 ^ |.^| < 2000} and <„ 1 for all n ^ 0. 

Thus it suffices to consider restricted weak-type estimates for 

L f /i(' + d„(s, ■), h 

\neZ 

Xi ^X,.,,n[/l,/2](-)c^n(s, ■),/3 


Now using Lemma 6.3 with $K_{s,j}$ playing the role of $\phi$, it follows that to obtain
the desired restricted weak-type estimates for $\Lambda_{\mathrm{short}}$ we are left with showing the
following theorem. Below, $\nu = (j_1, j_2, e, \iota)$ is any quadruplet of integers such that
$0 \le j_1, j_2 \le 4$, $498 \le |e| \le 4002$, $0 \le \iota < 4000$, and $\mathbf{P}_\nu$ is defined by (15).


Theorem 6.5. Let r > 2 and pi,p 2 ,q satisfy ([^. 

Suppose that (uniformly over tri-tiles $P \in \mathbf{P}_\nu$, $i = 1, 2, 3$) $\psi_{P,i}$ is an $L^2$-normalized wave
packet adapted to $P_i$ up to order $M$ sufficiently large.

Let $(d_n)_{n \in \mathbb{Z}}$ be a sequence of measurable functions such that $\sum_n |d_n(x)|^2 \le 1$.
Then the trilinear form
satisfies restricted weak-type estimates with exponents $\alpha$ arbitrarily close to any
vertex of $A$ defined by (14).


7. Auxiliary estimates 


The following bound follows from the Lépingle inequality and a square function
argument, see [10] for details.


Lemma 7.1. Let $\zeta$ be any Schwartz function. Let $\zeta_k(\cdot) = 2^{-k}\zeta(2^{-k}\cdot)$. Then for
$r > 2$ and $1 < s < \infty$ it holds that

$$\|(\zeta_k * f)(x)\|_{L^s_x(V^r_k(\mathbb{Z}))} \lesssim_{r,s} \|f\|_{L^s}.$$
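
For a finite sequence, a variation norm of the kind appearing in Lemma 7.1 can be evaluated exactly. The following is a minimal sketch, assuming the jump-only form $\sup (\sum_i |a_{k_i} - a_{k_{i-1}}|^r)^{1/r}$ over increasing index chains $k_0 < k_1 < \dots$; the normalization used in this paper is defined outside this section and may differ (for instance by an additional supremum term). The function name `r_variation` is hypothetical.

```python
def r_variation(seq, r):
    """Jump-only r-variation of a finite real sequence (assumed convention).

    best[j] holds the largest value of sum |a_{k_i} - a_{k_{i-1}}|^r over
    increasing index chains that end at index j (0 for a one-point chain).
    """
    n = len(seq)
    if n < 2:
        return 0.0
    best = [0.0] * n
    for j in range(1, n):
        # Extend the best chain ending at some earlier index i by the jump i -> j.
        best[j] = max(best[i] + abs(seq[j] - seq[i]) ** r for i in range(j))
    return max(best) ** (1.0 / r)


if __name__ == "__main__":
    print(r_variation([0, 1, 0, 1], 1))  # total variation of the sequence
    print(r_variation([0, 1, 0, 1], 2))
```

The dynamic program runs in quadratic time, which is enough for experiments; for a monotone sequence and $r \ge 1$ the optimal chain is the single jump from the first to the last term.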

Next, we have a Rademacher-Menshov type lemma: 

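Lemma 7.2. Let $f_1, \dots, f_N$ be functions on a measure space $X$ such that for every
sequence of signs $\epsilon_1, \dots, \epsilon_N \in \{1, -1\}$ it holds that

(27) $\|\epsilon_1 f_1 + \epsilon_2 f_2 + \dots + \epsilon_N f_N\|_{L^2} \le B$.

Then

$$\Big\| \sum_{j=1}^{n} f_j(x) \Big\|_{L^2_x(V^2_n)} \lesssim (1 + \log N)\, B.$$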

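Proof. We rewrite $\sum_{j \le n} f_j(x) = f_n(x) + \sum_{1 \le j < n} f_j(x)$. Estimating the $V^2$ norm
by the $\ell^2$ norm, it is clear that

$$\|f_n(x)\|_{L^2_x(V^2_n)} \lesssim \Big( \sum_{n=1}^{N} \|f_n\|_{L^2}^2 \Big)^{1/2} \le \Big( \mathbb{E}_{\epsilon_1,\dots,\epsilon_N} \Big\| \sum_{n} \epsilon_n f_n \Big\|_{L^2}^2 \Big)^{1/2} \le B.$$
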

It remains to consider the contribution from $\sum_{1 \le j < n} f_j$. For each $n \in \{1, \dots, N\}$
we will decompose $[0,n)$ into disjoint subintervals

$$[0, n) = \bigcup_{0 \le m \le \log_2(N)} \omega_{n,m}$$

as follows: Let $I$ be the dyadic interval of length $2^{m+1}$ that contains $n$. If $n$ is on
the left half of $I$ then let $\omega_{n,m} = \emptyset$. If $n$ is on the right half of $I$ then let $\omega_{n,m}$ be
the left half of $I$. It follows that

$$\Big\| \sum_{j < n} f_j(x) \Big\|_{V^2_n} \le \sum_{0 \le m \le \log_2(N)} \Big\| \sum_{j \in \omega_{n,m}} f_j(x) \Big\|_{V^2_n}.$$

Since for each $m$, $\omega_{n,m}$ is constant (in $n$) on dyadic intervals of length $2^m$, we have

$$\Big\| \Big\| \sum_{j \in \omega_{n,m}} f_j(x) \Big\|_{V^2_n} \Big\|_{L^2_x} \lesssim \Big\| \Big( \sum_{\omega \text{ dyadic}:\ |\omega| = 2^m} \Big| \sum_{j \in \omega} f_j(x) \Big|^2 \Big)^{1/2} \Big\|_{L^2_x} \lesssim B;$$

here the final inequality follows from another appeal to (27) using sequences $\{\epsilon_j\}$
that are constant (as functions of $j$) on dyadic intervals of length $2^m$. □
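
The dyadic decomposition used in the proof above can be checked mechanically. The sketch below is only an illustration with hypothetical function names (`omega`, `decompose`): for each scale $m$ it returns the left half of the dyadic interval of length $2^{m+1}$ containing $n$ when $n$ lies in the right half, and nothing otherwise, which amounts to reading off the binary digits of $n$.

```python
def omega(n, m):
    """Return omega_{n,m} as a half-open interval (start, end), or None if empty."""
    length = 1 << (m + 1)              # length of the dyadic interval I
    start = (n // length) * length     # left endpoint of the I containing n
    mid = start + (1 << m)             # midpoint of I
    if n < mid:                        # n is in the left half: omega is empty
        return None
    return (start, mid)                # n is in the right half: take the left half


def decompose(n, max_m):
    """List the non-empty pieces omega_{n,m} for 0 <= m <= max_m."""
    pieces = (omega(n, m) for m in range(max_m + 1))
    return [p for p in pieces if p is not None]


if __name__ == "__main__":
    print(decompose(13, 3))  # pieces tile [0, 13), one piece per binary digit of 13
```

For fixed $m$ the piece depends only on the dyadic interval of length $2^m$ containing $n$, which is the property the proof uses to pass to a square function.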

We'll also use a Bessel inequality, Lemma 7.3. For a proof see e.g. [7, Proposition
13.1]. Below recall that $\psi_{P,j}$ are (unmodified) $L^2$-normalized Fourier wave packets
up to sufficiently high orders:
Lemma 7.3. Let $j \in \{1,2,3\}$. Let $\mathbf{T}$ be a collection of strongly $j$-disjoint trees,
let $\mathbf{Q} = \bigcup_{T \in \mathbf{T}} T$, and suppose that $\|\sum_{T \in \mathbf{T}} 1_{I_T}\|_{L^\infty} \le L$. Then for any sequence of
coefficients $\{b_P\}_{P \in \mathbf{Q}}$

$$\Big\| \sum_{P \in \mathbf{Q}} b_P \psi_{P,j} \Big\|_{L^2} \lesssim \log(1 + L)\, \|b_P\|_{\ell^2(\mathbf{Q})}.$$



8. A VARIATION-NORM MULTIPLIER ESTIMATE 


In this section we consider a variation-norm version of Bourgain [2, Lemma
4.11], namely Theorem 8.1 below. In the following, let $\xi_1 < \dots < \xi_N$ be real
numbers. For each integer $k$ we denote the sharp multi-frequency projection at
scale $k$ onto $\{\xi_1, \dots, \xi_N\}$ by $\Pi_k[f] = \mathcal{F}^{-1}[1_{R_k} \hat f]$, where $R_k = \bigcup_{j=1}^{N} [\xi_j - 2^{-k}, \xi_j + 2^{-k}]$.
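
The proof of Theorem 8.1 below splits the scales $k$ according to the number of connected components of $R_k$. As a rough illustration, assuming that $R_k$ is the union of the closed intervals of radius $2^{-k}$ around the frequencies $\xi_j$ (the helper name `count_components` is hypothetical), one can count the components directly and observe that the count is non-decreasing in $k$ and changes at most $N - 1$ times:

```python
def count_components(xis, k):
    """Number of connected components of the union of [xi - 2^-k, xi + 2^-k]."""
    radius = 2.0 ** (-k)
    intervals = sorted((x - radius, x + radius) for x in xis)
    count = 0
    right = None                      # right endpoint of the current component
    for a, b in intervals:
        if right is None or a > right:
            count += 1                # a gap: a new component starts here
            right = b
        else:
            right = max(right, b)     # overlapping or touching: same component
    return count


if __name__ == "__main__":
    xis = [0.0, 1.0, 4.0]
    print([count_components(xis, k) for k in range(-2, 4)])
```

Since the count only increases as $k$ grows, the integers split into at most $N$ consecutive ranges on which it is constant.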

Theorem 8.1. For every $r > 2$ and $\epsilon > 0$ it holds that

$$\|\Pi_k[f](x)\|_{L^2_x(V^r_k)} \le C_{\epsilon,r} N^{\epsilon} \|f\|_{L^2}.$$

A variant of Theorem 8.1 with smooth multi-frequency projections was considered
in ini, where a range of $L^p$ estimates was obtained; for the current paper we
need sharp frequency projections, but $L^2$ is sufficient.

The starting point of our proof is Lemma 3.2 from mi:

Lemma 8.2. Suppose that {cfc}”^o ® sequence in , and 2 < q < r. Then 


N 

A 

i=i 




2iTi^jy\ 


a6[0.1] 


(LT^o) 




_ 


where C may depend on r, q and min^ — ^j_i 


Through a standard averaging argument (see e.g. the proof of Proposition 4.1 
in HZl), the lemma above gives 


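Proposition 8.1. Let $\chi$ be a smooth function such that $\hat\chi$ is identically one on
$[-0.9, 0.9]$ and supported on $[-1,1]$. Assume that $\xi_{j+1} \ge \xi_j + 1$ for each $j$, and let
$\chi_{k,j}$ be defined by $\hat\chi_{k,j}(\xi) = \hat\chi(2^k(\xi - \xi_j))$. Then for $r > 2$ and $\epsilon > 0$ it holds that

$$\Big\| \sum_j (\chi_{k,j} * f)(x) \Big\|_{L^2_x(V^r_{k \ge 0})} \le C_{r,\epsilon,\chi} N^{\epsilon} \|f\|_{L^2}.$$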


Proof. To keep the paper self-contained, we sketch the averaging argument. Let
$f_j(x) = \mathcal{F}^{-1}[1_{|\xi - \xi_j| \le 1} \hat f(\xi)](x)$. Since the $\xi_j$'s are separated, we have $\|f\|_2 \gtrsim \|(f_j)\|_{L^2(\ell^2)}$.

Let M be the best constant such that if suppiffjj) c — 1, -t- 1] for all j then 


l^j^N 


by the triangle inequality and Lemma 7.2 it is clear that M = O 
aim is to show that M = Oe,r,x{Th'^)- Since gj is supported on 
\y\ small we have 


XX 


(N) < 00 . Our 

- 1,0 + i]> 


$$\|g_j(x) - e^{2\pi i \xi_j y} g_j(x - y)\|_{L^2_x(\ell^2_j)} \lesssim |y|\, \|g_j\|_{L^2(\ell^2)}.$$
Averaging over $0 \le y \le \delta$ with $0 < \delta < 1$ sufficiently small, we obtain


2 ixk,j * gj){x)\\Li{v-^^) 


< 


CA 2 




M, 


iXk,j * 9j){x - y)\\Ll + TV II II Li (£ 2 ) 




M, 


C5II Yj + yll^ill^ih?) 

IsSjsSW 


using translation invariance and Fubini. Using Lemma 8.2 for each fixed $x$, it follows
that for $q > 2$ sufficiently close to 2 (depending on $\epsilon > 0$) we have

II ^ {Xk,j * gj){x)\\i,2(^Yr^^'^ ^ IKXfcj * gj)(a;)||^2(y9(£p) + —||gj||^2(£2) 


ligjigAr 


< 


M, 


Ce,q,r^ \\iXk,j * 9j){x)\\eyLl{V^)) ^ 9 IIS'j II Li(£i) 


M, 


^ ^<^,q,r,x^ ll5'ill£i(L2) + r, I|5'jIIl2(£2) , 


in the second estimate we used $q > 2$, and in the last estimate we used Lemma 7.1.


Since this holds for arbitrary (gj) satisfying supp{gj) c — 1, +1], by definition 

of M we obtain M ^ therefore M = desired. □ 

Now, using Proposition 8.1 and a simple square function argument, we obtain
a frequency separated version of Theorem 8.1. Namely, if $\xi_{j+1} \ge \xi_j + 1$ for each $j$
then for $r > 2$ and $\epsilon > 0$ it holds that

$$\|\Pi_k[f]\|_{L^2(V^r_{k \ge 0})} \le C_{r,\epsilon} N^{\epsilon} \|f\|_{L^2}.$$

To remove the frequency separation requirement $\xi_{j+1} \ge \xi_j + 1$, we will need the
following estimate, which will be proved using Lemma 7.2.

Proposition 8.2. Suppose that $S$ is a finite set of integers. Then
(28) $\|\Pi_s[f]\|_{L^2(V^2_{s \in S})} \le C(1 + \log(|S|))\|f\|_{L^2}$.

Proof. Let $n = |S|$. Let $s_1 < \dots < s_n$ be the elements of $S$. For $j = 1, \dots, n-1$ write
$f_j = \Pi_{s_j}[f] - \Pi_{s_{j+1}}[f]$
and let $f_n = \Pi_{s_n}[f]$. Then, the $f_j$ are orthogonal in $L^2(\mathbb{R})$, and $\Pi_{s_k}[f] =
\sum_{j \ge k} f_j$. Lemma 7.2 gives (28). □


Proof of Theorem 8.1. By monotone convergence, it suffices to prove 

for every finite interval [a, b], provided that the constant is independent of [a, b]. 
Now, we may choose $\{k_j\}_{j=0}^{N}$ with
so that the number of connected components of $R_k$ is constant on each interval
$[k_j, k_{j+1})$. Then, for each $x$


N 


1/2 


l|nt[/](x)||v; 


A:G[a,b] 




C||nij/](a:)||„, + c H l|ni.[/](x)|| 


yr 


0 = 1 


The contribution to the norm of the first term on the right above is acceptable
by Proposition 8.2. Furthermore, for each $k, k' \in [k_{j-1}, k_j)$ we have

Uk[f]{x) - Ukif]{x) = Uk[f,]{x) - Ukimx) 


where $f_j = \mathcal{F}^{-1}[(1_{R_{k_{j-1}}} - 1_{R_{k_j}}) \hat f]$. Thus, using the orthogonality of the $f_j$, it
suffices to show that, for each $1 \le j \le N$,


l|nfc[/](x)||z,2(y. 






Cr,eNV\\L^ 


Fix $j$ and let $M$ be the (constant) number of connected components of $R_k$ for
$k \in [k_{j-1}, k_j)$; clearly $M \le N$. For $k \in [k_{j-1}, k_j)$ we can write $R_k$ as the disjoint
union of open intervals


Rk 


U hk 


where R^k' c R^k for k' > k. Let = {.^i,..., and dehne 

L> {Ce[l,M]:|/,,i,_. nn| = l) , ni[/] ^ 

teLtt 

and let = [1,M] — L** and dehne fl^ analogously. Clearly 11^ = n|, + 11^. 
Rescaling by a factor of an application of the known frequency-separated 

case immediately gives 

3 Rj 

and so it remains to consider 11 ^ For each i e R’ let 


Q = mm{R^kj.i Q) , pg = max{R^kj.i n hi) 
and R = (0, Pe), so that R^k = (R - R] ^ {Ph Pi + 2"^). Now, dehne 

n^b/] = L 

leL'° 

n“[/] - L ■^^‘[i(„-2-*,p,+2-‘)/] . 

leL^ 

For k e [fcj-i, kj) we obtain the decomposition 

nb/] - 9 + nbPi] + nfM 

where g = Rf) (which stays the same under fl^) and 

'‘1 - 

leL^ leP>
Rescaling by a factor of an application of the known frequency-separated
case then gives

J

for $i = 1, 2$, finishing the proof. □


9. Size and a variation-norm size bound 


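We will use the following standard notion of size:

Definition 9.1. Let $j \in \{1,2,3\}$, $\mathbf{P} \subset \mathbf{P}_\nu$, and $f$ be a function on $\mathbb{R}$. Then

$$\mathrm{size}_j(\mathbf{P}, f) = \sup_{T \subset \mathbf{P}} \Big( \frac{1}{|I_T|} \sum_{P \in T} |\langle f, \phi_{P,j} \rangle|^2 \Big)^{1/2}$$
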

where the supremum is over all j-lacunary trees contained in P and where the 
functions (fpj are defined in Section^ 

The aim of this section is to prove: 

Proposition 9.1. Let $s > 1$, $r > 2$, and $\mathbf{P} \subset \mathbf{P}_\nu$. Then for $j \in \{1, 2, 3\}$


sizej(P,/) <* 


sup 

P,P'eP Ip^I^Ip, 


^ \f{xWxi{xf dx 


1/s 


where the inside supremum is over dyadic intervals. 

We will make use of a John-Nirenberg type lemma, proven in [T6] 

Lemma 9.2. Let {cpjpgp be a collection of coefficients. Let j e {1,2,3}. For 
1 ^ p < cc let 


Bp = 


sup 

TcP 


It\Tp 


PeT 


CP 


1/p 


)l/2 


LP 


where the sup is over all j-lacunary trees, and define i?i,oo analogously. Then 


Bo 


< 


CB^^, 


Recall that $\nu = (j_1, j_2, e, \iota)$ with $0 \le j_1, j_2 \le 4$, $498 \le |e| \le 4002$, and $0 \le \iota <
4000$. We will also need the following lemma:


Lemma 9.3. There is a Schwartz function $\zeta$ such that for each $j$-lacunary tree
$T \subset \mathbf{P}_\nu$, $j \in \{1, 2, 3\}$, each sequence of coefficients $\{c_P\}_{P \in T}$ and each integer $k$ with
$2^k \le |I_T|$ and $-k \equiv \iota \bmod 4000$ we have


X; cp.4'p.i(x) - 

PeT: |/p|>2'“ 






2 <^P^Pu 


PeT 


X 


Proof. One can check that for each $P \in T$ with $|I_P| < |I_T|$, we have

$$\mathrm{supp}\, \widehat{\psi_{P,j}} \subset c(\omega_{(P_T)_j}) \pm (10|I_P|^{-1}, 10000|I_P|^{-1})$$

(the sign depends on $e$ and $j$). Choosing $\zeta$ with $\hat\zeta = 1$ on $(-10000, 10000)$ and $\hat\zeta$
supported on $(-10001, 10001)$, the fact that for $P \in T$ we have $-\log_2(|I_P|) \equiv \iota
\bmod 4000$ then gives the lemma since $2^{4000} \cdot 10 > 10001$. □
Proof of Proposition 9.1. The case $j = 1, 2$ of the desired conclusion is standard
and we could actually get $s = 1$, so the argument below (while applicable for all $j$)
is only needed for $j = 3$. By Lemma 9.2 it suffices to fix a $j$-lacunary tree $T \subset \mathbf{P}$
and show that


(29) 




PsT 


^ Dpi 




< 


r,s 


sup sup ( 7 ^ \ \f{x)\'^Xl{xf dx 

P,P'eT /pc/c/p, Vbl J 


l/s 


By dividing $T$ into maximal subtrees with top, we may assume that $P_T \in T$. Let
$R$ denote the right side of (29). If $\mathrm{supp}(f) \subset \mathbb{R} \setminus 2I_T$ then for each $P \in T$


(30) 


if, 


|/p|V2 


< c 




M-2 


\f{x)\xip{xf dx 


M-2 


R . 


With M sufficiently large (say M > 3), it follows that the left side of (29) is 
bounded above by 






CR 



Ip\^/^R 


Thus, it remains to prove (29) for functions supported on $2I_T$. From this support
assumption, we see that it suffices (by choosing $I = I_T$) to show


(31) 


PeT 


< o., ifh- 

' * \J-P ^ 


By the usual Rademacher function argument, the left side of (31) is 

^ sup II 2 ^H(/,0P,j) hip II 

{bpjp^T PsT 


where the supremum is over all sequences $\{b_P\}$ of $\pm 1$'s on $T$ and $h_{I_P}$ is the $L^2$-normalized
Haar function adapted to $I_P$. After fixing such a sequence and using
duality, we are then reduced to showing the bound


(32) 


'^bp {g,hip) (fpjW^P < Cr,s\\g\\L^' 

PeT 


where $s' = s/(s - 1)$. Recalling the definition of $\phi_{P,j}$, the left side of (32) is

(33) $\Big\| \sum_{P \in T:\ |I_P| \ge 2^k} b_P \langle g, h_{I_P} \rangle \psi_{P,j}(x) \Big\|_{L^{s'}_x(V^r_k)}$.

By Lemma 9.3, the display above is




> ^C(2 ^■) * [e 2 bp{g,hi^)ijpj]{x)\\Lsj^vi 


)• 


PeT 


By Lemma 7.1, the display above is 

^ 1 L bp {g, hip) i/’pjllis' 

PeT 

the second estimate follows from standard Calderon-Zygmund theory. □ 

10. A VARIATION-NORM SIZE INCREMENT LEMMA 

Proposition 10.1. Let $\mathbf{P} \subset \mathbf{P}_\nu$ be a finite collection of tri-tiles, $\delta > 0$, $r > 2$, and
$j \in \{1,2,3\}$. Suppose that $M$ (from the hypotheses of Theorem 6.4) is sufficiently
large depending on $\delta$. Then for each $\alpha$ satisfying

$$\mathrm{size}_j(\mathbf{P}, f) \le \alpha$$

we can find a collection of trees $\mathbf{T}$, each contained in $\mathbf{P}$, satisfying


(34) 

sizej(P\ y T,/) < ^a, 


TeT 

(35) 

0 It\ < 

TeT ^ ' 


Below, we show how Proposition 10.1 follows from the variation-norm Bessel
inequality, Theorem 11.1. The proof uses a standard stopping time argument,
which we recall in order to note that our condition (|^ in the definition of strong
$j$-disjointness is satisfied.

Below recall that if $T$ is $i$-overlapping then for each $j \in \{1,2,3\} \setminus \{i\}$ the sign
$\epsilon_{ij} := \mathrm{sgn}(c(\omega_{P_j}) - c(\omega_{(P_T)_j}))$ depends only on $i$, $j$, $e$, and not on $T$ (for details
see the discussion after Definition 5.2).

Proof (reduction to Bessel inequality). By scaling $f$ we may assume that $\alpha = 1$.

It suffices to show that for each $i \in \{1, 2, 3\} \setminus \{j\}$ we could find $\mathbf{T}$ satisfying (35)
such that for each $i$-overlapping tree $T \subset \mathbf{P} \setminus \bigcup_{T' \in \mathbf{T}} T'$ we have
(36) 


PeT 


1 

4 


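Let $T_0 = S_0 = \emptyset$. Suppose $T_0, \dots, T_n$ and $S_0, \dots, S_n$ have been chosen and set

$$\mathbf{P}_n = \mathbf{P} \setminus \Big( \bigcup_{k=0}^{n} T_k \cup S_k \Big).$$

If there are no $i$-overlapping trees $T \subset \mathbf{P}_n$ violating (36) then we finish by setting

$$\mathbf{T} = \{T_k\}_{k=1}^{n} \cup \{S_k\}_{k=1}^{n}.$$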


(Footnote.) Here we use the fact that the variation over all $k$ in (33) is the same as the variation restricted
to $2^k \le |I_T|$ and $-k \equiv \iota \bmod 4000$, which is the same as the restricted variation for the convolution,
which is bounded by the variation over all $k$ for the convolution.
Otherwise, if $\mathbf{P}_n$ contains an $i$-overlapping tree violating (36) then we may choose
such a tree $\tilde T_{n+1}$ so that $\epsilon_{ij} c(\omega_{(P_{\tilde T_{n+1}})_j})$ is maximal. We then let $T_{n+1}$ be the
maximal (with respect to inclusion) $i$-overlapping tree contained in $\mathbf{P}_n$ which
satisfies $P_{T_{n+1}} = P_{\tilde T_{n+1}}$. Let $S_{n+1}$ be the maximal (with respect to inclusion)
$j$-overlapping tree contained in $\mathbf{P}_n \setminus T_{n+1}$ which satisfies $P_{S_{n+1}} = P_{\tilde T_{n+1}}$.

Since $\mathbf{P}$ is finite and $T_n \ne \emptyset$, this process will eventually terminate, yielding
some

$$\mathbf{T} := \{T_k\}_k \cup \{S_k\}_k.$$

We claim that the collection $\{T_k\}_k$ is strongly $j$-disjoint (recall that this is defined
in Definition 5.3), and so Proposition 10.1 follows from Theorem 11.1. It suffices
to verify condition ([^ and condition (^ of Definition 5.3.
In the following, let $k \ne k'$, $P \in T_k$, and $P' \in T_{k'}$.

For (|^, assume that ojp. c ojp,^. Then |a;pj| ^ which implies that 

eijc{u)(^P^j.) > eijc{uj[p^^^)^) and so k < k.' But, since 3u(^p^j. c 30|e|a;p^- c 3ci;pj 
and P' ^ Sk we must have Ip/ n Ip^. = 0 ■ 

Now, to see condition (|^, by symmetry it suffices to show that Pj ^ {PTf,)j- 
First suppose that Pj = {PTk)j, or equivalently P' = Pp^. Then for each P e Tk 
we have Pj < P' ^ {PTy)i and so we must have k < k' or else every element of 
Tfc would have already been chosen in T^./. But, if fc < fc' then we would have 
P' e Tfc, contradicting P' e T^i. Now, suppose that Pj < {Pp^.)j- Then, as in the 


verification of (ii^jC{bJ(^PT^)j 
P' ^ Sk contradicts Pj ^ (Prj,)p 


> e 
□ 




c{uj(Pj, and so k < k.' But, the fact that 


11. A VARIATION-NORM BESSEL INEQUALITY
In this section, we fix $\sigma > 0$ and assume that the order $M$ of the wave packets
(from the hypotheses of Theorem 6.4) is sufficiently large depending on $\sigma$. Our
goal here is to prove the following variation-norm Bessel inequality:

Theorem 11.1. Let T be a collection of strongly j-disjoint trees, such that 


< 


(37) sup (— 21 (/,0pj) 

I dyadic |-^ | p^j' 

Ip^I 

for each $T \in \mathbf{T}$. Let $\mathbf{P} = \bigcup_{T \in \mathbf{T}} T$. Then


< 




PeT 


Yj I ^Tj) 




II/IIl® ll/llia- 


PeP 


As in [7], we prove Theorem 11.1 via a sequence of reductions.


11.1. Proof of Theorem 11.1, reduction 1. Thanks to Lemma 11.2 below,
Theorem 11.1 follows immediately from the following proposition:


Proposition 11.1. Assume $\mathbf{P}$ and $\mathbf{T}$ as in Theorem 11.1. Then for all $\delta > 0$


(38) 


2 I 




||-Ax||pc» 


PgP 


/,/( 


x)fxi{xy^ dx 


if $I_T \subset I$ dyadic for all $T \in \mathbf{T}$.
Lemma 11.2 below in turn is a result from [7] where it was proved using a series
of interesting lemmas. To keep the current paper self-contained, we'll sketch a
direct proof, which simplifies some arguments in [7]. To formulate the lemma, we
first fix some notations. For $\mathbf{S} \subset \mathbf{T}$ let $N_{\mathbf{S}}$ denote $\sum_{T \in \mathbf{S}} 1_{I_T}$, and define

$$\|\mathbf{S}\|_{BMO} := \sup_{I \text{ dyadic}} \frac{1}{|I|} \sum_{T \in \mathbf{S}:\ I_T \subset I} |I_T|.$$


Lemma 11.2. Let A,B > 0 and 0 < 5 < 1. Let T be a collection of trees, 
every subset S of T it holds that ||iVs||i < y4||A^s||^ and ||S||bmo ^ -S||^s||® 


If for 
then 


\\T\\bmo <s 




Proof of Lemma We hrst show that ||T1 |bmo 
show that for every dyadic interval Jq it holds that 
1 

Wo 


yl5<5/(l-6) 

^ (35) 


It suffices to 


(39) 




L 

TgT: 

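Fix $I_0$. Let $\mathbf{S}$ contain all elements $T \in \mathbf{T}$ such that $I_T \subset I_0$ and the set $\{S' \in \mathbf{T} :
I_T \subset I_{S'} \subset I_0\}$ contains at most $(3B)^{1/(1-\delta)}$ elements. Clearly $\|N_{\mathbf{S}}\|_\infty \le (3B)^{1/(1-\delta)}$,
therefore by the given assumption we have

(40) $\|\mathbf{S}\|_{BMO} \le B(3B)^{\delta/(1-\delta)} = 3^{\delta/(1-\delta)} B^{1/(1-\delta)}$.

Let $\mathcal{J}$ be the set of maximal dyadic intervals $J \subset I_0$ such that the set $\{S' \in \mathbf{T} :
J \subset I_{S'} \subset I_0\}$ contains more than $(3B)^{1/(1-\delta)}$ elements. Clearly, for every $T \in \mathbf{T} \setminus \mathbf{S}$
such that $I_T \subset I_0$, $I_T$ is contained in one of these $J$'s. It follows that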
L iTi YYY'’-' < 


(«) ^ L 


TeT\S 

It^Iq 


JsJ TsT 
It^J 




JeJ 


By maximality of $J$, there exists $T \in \mathbf{T}$ such that $I_T = J$. Let $\mathbf{S}_J$ denote the
collection of such $T$; then $\|\mathbf{S}_J\|_{BMO} = \|N_{\mathbf{S}_J}\|_\infty = |\mathbf{S}_J|$, therefore using the given
assumption we obtain $\|N_{\mathbf{S}_J}\|_\infty \le B^{1/(1-\delta)}$. For every $x \in J$ it follows that

Ns{x) 






L 

TgT'.J^Ij'CzIq 


l/j,(x) 




> 


Together with , we obtain 

1 IT I I|T||bmo II^s||li(/o) 

K\ ^ I In I 2-3 F(i-5) 

' TsT\S:/tc/o 


2 

TsT: Jc/yc/o 
2.3‘5/(1-'5) 2j1/(1-<S) 




< 


\BMO 


\BMO 


2 • dFli--?) 


Using ([4b|), (39) immediately follows, completing the proof of ||T| 


BMO 


<A 5h(i-^). 


We now free the variables $I_0$, $\mathbf{S}$, $\mathcal{J}$ to be used for other purposes below.

Fix a large constant $C > 0$ to be chosen later. Let $\mathbf{S}$ contain all $T \in \mathbf{T}$ such that
It is not a subset of {x : Nt{x) > It is clear that ||A^'s||oo ^ 

so by the given hypothesis ||iVs||i ^ It suffices to show that 

(42) ||./V't\s||i = Os{C 

Indeed, from (42) by choosing $C$ large we obtain $\|N_{\mathbf{T} \setminus \mathbf{S}}\|_1 \le \frac{1}{2}\|N_{\mathbf{T}}\|_1$, thus
$\|N_{\mathbf{T}}\|_1 \le 2\|N_{\mathbf{S}}\|_1$, which implies the desired estimate.

Let $\mathcal{J}$ be the collection of maximal dyadic subintervals of the superlevel set of $N_{\mathbf{T}}$ used above to define $\mathbf{S}$.

It follows that if $T \in \mathbf{T} \setminus \mathbf{S}$ then $I_T$ is a subset of some element of $\mathcal{J}$. Therefore


WNr 


T\S II1 




< 


L L I 

JeJ TgT-.It'^J 


< 


Li 

JeJ 


IBMO 




\\T\\bmo\{x : Nt{x) > 

Since HTUbmo = we obtain ||A^t\s||i = Os{C~^\\Nr 


HAT. 


T 1 


BMO cb1/{1-5) ■ 
Till), as desired. □ 


11.2. Proof of Theorem 11.1, reduction 2. We first note that (38) follows
from the unweighted version where the factor xY* is not on the right hand side. 
Indeed, writing (/, (f)pj) = {fxYi X7 using the fact that Xi^^{x) is also 

a polynomial in x (which implies that is still a wave packet adapted to Pj 

of order sufficiently large, recall also that (j)pj and 'ippj are the same if j = 1,2 and 
related by a variational factor if j = 3), (38) follows from applying the unweighted 
version to fxY ^J^cl the rescaled wave packets. 

We now show that the unweighted (38) follows from the following proposition. 

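Proposition 11.2. Let $\mathbf{T}$ be strongly $j$-disjoint. Let $\mathbf{P} = \bigcup_{T \in \mathbf{T}} T$. Then for every
$L \ge \|N_{\mathbf{T}}\|_\infty$ there exists $\mathbf{P}^* \subset \mathbf{P}$ with the following two properties: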

(43) 


L 

pgp\p4 


(44) 


(/> 

I U 

PeP* 






PWfWl 

TsT 


Indeed, apply Proposition 11.2 with $L = C\|N_{\mathbf{T}}\|_\infty$ for a sufficiently large $\delta$-dependent
constant $C$. Now, to get (the unweighted) (38) it suffices to show

(45) V. . .. , , 1 


Yj (l>pg) 1^ ^ 2 ^ I 


PeP* 


PeP 


Let $I_{P_0}$ be a maximal interval in $\{I_P, P \in \mathbf{P}^*\}$, and remove from $\mathbf{P}^*$ all tri-tiles
$P$ such that $P$ is in the same tree as $P_0$ and $I_P \subset I_{P_0}$. We repeat this process
with what is left of $\mathbf{P}^*$. This algorithm gives a collection of trees $\mathbf{T}^*$ such that
$\{I_T, T \in \mathbf{T}^*\}$ cover $\mathbf{P}^*$ while $\sum_{T \in \mathbf{T}^*} 1_{I_T} \le \sum_{T \in \mathbf{T}} 1_{I_T}$. Now, using (37) and (44),
(45) follows from the following sequence of estimates and choosing $C$ large:

U 


(/, 0Pj) 






PeP* 




PeT* 

{L/C)L-^ ^ 


II iV^ 


TeT 


T* I 00 I 


< — 


< 


PeP* 
2 

C 


Y I 


PeP 


11.3. Proof of Theorem 11.1, reduction 3. In this section we reduce Proposition
11.2 to the following more technical result. We first fix some notations. Given
$A > 1$ and $d \in \{0,1,2\}$, a collection of intervals $\mathcal{I} \subset \mathcal{Q}_d$ is $(A,d)$-sparse if


• for each $I \in \mathcal{I}$, $AI$ is $d$-regular (see Section]^);

• for each $I, I' \in \mathcal{I}$ with $|I| > |I'|$, we have $|I| \ge 2^{100A}|I'|$;

• for each $I, I' \in \mathcal{I}$ with $|I| = |I'|$, we have $\mathrm{dist}(I, I') \ge 100A|I'|$.
Proposition 11.3. Let $A, L, \eta \in (1, \infty)$ and $\epsilon > 0$. Let $\mathbf{T}$ be a collection of
strongly $j$-disjoint trees with $\|N_{\mathbf{T}}\|_{L^\infty} \le L$. Let $\mathbf{P} = \bigcup_{T \in \mathbf{T}} T$ and suppose that

$\{I_P : P \in \mathbf{P}\} \cup \{I_T : T \in \mathbf{T}\}$ is $(A, d)$-sparse. Then, there exists $\mathbf{P}^* \subset \mathbf{P}$ such that

I U Ip\ 

PeP* 

Yj 

PeP\P* 






(A-^ + L-^) Y \It\ 

TeT 

{{ALY + L^A^-^fl 


and 


Hi- 


Assuming Proposition 11.3, we will prove Proposition 11.2.

Let $\mathbf{T}$ be strongly $j$-disjoint and $\mathbf{P} = \bigcup_{T \in \mathbf{T}} T$.

By a simple pigeonholing argument, given any A > 1 we may partition P 
into subsets Pi,, P^ where L = (9(A^), with the following property: for each 
1 ^ k ^ L there exists d e {1, 2, 3} such that [Ip, P e Pfc} is (A, dj-sparse. 

This partition also leads to a partition of each $T \in \mathbf{T}$; therefore $\mathbf{P}_k$ is also the
union of a collection $\mathbf{T}_k$ of strongly $j$-disjoint trees with $\|N_{\mathbf{T}_k}\|_\infty \le \|N_{\mathbf{T}}\|_\infty \le L$.
Each tree in $\mathbf{T}_k$ could be further decomposed into subtrees such that each of
the new subtrees contains its own top, and the top intervals of the subtrees are
disjoint. We obtain $\mathbf{T}'_k$, a collection of trees with top, which is still strongly
$j$-disjoint;

disjoint, furthermore ||iVr 


T'JIco 


< II 


^ L. 


We are now in a position to apply Proposition |11.3| for Tj., producing P 
Letting P 


Ul^fcscL Pfc 


and using L = O(A^) it follows that 

U Jp| <, {A^-^ + A^L~^) 2 \Ij 


PeP* 


PeP\P* 


if, (l>P,j) 




(A^+"T" + L^A^-Y 


TeT 
2 | 


The desired estimates for P* follows by letting e = 5/2, A = and p large. 


11.4. Proof of Theorem 11.1, reduction 4. Let I = [Ip '■ T e T}. Let > 0 


to be chosen later (depending only on p). For each / e I consider the Drj{A~^ + 
L~Y\I\ neighborhood of its endpoints, i.e. the set of x such that dist(a;, dJ) ^ 
Dfj{A~^ + L~'^)\I\. Let El be the union of these neighborhoods over / el, and 
let E 2 = [Yjiei^-^^i^ ^ then let P* be the set of all P e P such that 

Jp c Pi u p 2 - Using the Fefferman-Stein maximal inequality, it follows that 


U 

PeP* 


^77 




(A-^ + L-^) V |/| + L 


-(4+2r,) 


7gI 


YI.M 1 


2||2+r7 
I) \\2+r) 


lel 


{A-^ + L-^)\\Nr 


T T 


Let T 2 = {T e T : Jp cj; Pi u P 2 } and let I 2 denote the set of top intervals of T 2 . 
We now show that || XiTGT 9 (-^[liT])^lloo allow us to reduce 

Proposition [11.3 to Proposition |11.4| below. Since for each / e I 2 there are at most 
L elements of T with Ip = I, it suffices to show that 


L 

/GI 2 


{M[h]YxY 




uniform over x e M, which we fix below. By further dividing I 2 it suffices to prove 
that Xi/Gi 3 (-^[^i])(^)^ every I 3 c I 2 with the following property: if 

I,r E I 3 and |T| < |/| then |T| ^ 2“'^|/|. By further dividing I 3 we may assume
that one of the following situations occurs: (i) $x \in I$ for all $I \in \mathbf{I}_3$; (ii) $x$ is on the
left of $I$ for all $I \in \mathbf{I}_3$; (iii) $x$ is on the right of $I$ for all $I \in \mathbf{I}_3$.

Now, the desired estimate is clear for (i), so by symmetry we only consider
situation (ii). By monotonicity we may assume further that $x$ is the left endpoint
of some $J \in \mathbf{I}_3$. By definition of $E_1$ it follows that for every $I \in \mathbf{I}_3 \setminus \{J\}$ we have
dist(x,/) > L“’'max(|J|, |J|). Using the $(A,d)$ sparseness of $\mathbf{I}_3$, it follows that


ISI3 


< Yj iMli){xf + Yj {Mh){xf+ Yj iMli){xy 
<„ 1 + L'^2~^ + inf V {Mli){xy 

(using the definition of E 2 ). 


< 


Proposition 11.4. Let $A, L, \eta > 1$ and $\epsilon > 0$. Let $\mathbf{T}$ be strongly $j$-disjoint with

(46) I 2 < L ■ 

TsT 

Let $\mathbf{P} = \bigcup_{T \in \mathbf{T}} T$. Assume that $\{I_P : P \in \mathbf{P}\} \cup \{I_T : T \in \mathbf{T}\}$ is $(A, d)$-sparse, and

(47) sup dist(a;, d/y) ^ DnA~‘^\lT\ 

xelp 

for all $P \in \mathbf{P}$, $T \in \mathbf{T}$. Then for $D_\eta$ sufficiently large depending on $\eta$ it holds that

i;i{/.^PJ>r s,.. ((ALY + L'’-A‘-’>f\\S\l>. 

PeP 


11.5. Proof of Proposition 11.4. For convenience of notation, assume without
loss of generality that $A^2 L$ is an integer. By duality, it suffices to show

(48) {ALY + L^A^-'^ 


PeP 


for every sequence $\{b_P\}_{P \in \mathbf{P}}$ such that $\|b\|_{\ell^2(\mathbf{P})} = 1$, which we will fix below.

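Let $\mathcal{J} = \{I_T : T \in \mathbf{T}\}$ and let $\mathcal{J}_A = \{(I_T)_A : T \in \mathbf{T}\}$ where $(I_T)_A$ is an interval
in $\mathcal{Q}_d$ (guaranteed by $(A, d)$ sparsity) such that $AI_T \subset (I_T)_A \subset 3AI_T$.

From the $(A, d)$ sparsity, it is clear that the map $I_T \mapsto (I_T)_A$ is bijective
from $\mathcal{J}$ to $\mathcal{J}_A$ and that if $I_T \subsetneq I_{T'}$ then $(I_T)_A \subsetneq (I_{T'})_A$.

We now decompose $\mathcal{J}_A$ into "layers". Let $\mathcal{J}_{A,1}$ be the set of maximal intervals
in $\mathcal{J}_A$ and for $m \ge 1$ let $\mathcal{J}_{A,m+1}$ be the set of maximal intervals in $\mathcal{J}_A \setminus \bigcup_{n \le m} \mathcal{J}_{A,n}$.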
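
The layer decomposition described above is a simple peeling procedure. The following sketch is an illustration only: intervals are encoded as hypothetical pairs `(left, right)`, the names `contains` and `peel_layers` are not from the paper, and no sparsity or dyadic structure is checked.

```python
def contains(outer, inner):
    """True if the interval `inner` is a subset of the interval `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def peel_layers(intervals):
    """Split a finite family of intervals into layers of maximal elements.

    Layer 1 consists of the intervals that are maximal with respect to
    inclusion; layer m + 1 consists of the maximal intervals among those
    remaining after layers 1, ..., m have been removed.
    """
    remaining = set(intervals)
    layers = []
    while remaining:
        maximal = {
            i for i in remaining
            if not any(j != i and contains(j, i) for j in remaining)
        }
        layers.append(maximal)
        remaining -= maximal
    return layers


if __name__ == "__main__":
    family = [(0, 8), (0, 4), (4, 8), (0, 2), (10, 12)]
    for m, layer in enumerate(peel_layers(family), start=1):
        print(m, sorted(layer))
```

For a nested family such as this one, the number of layers equals the largest number of intervals containing a common point, which is how a bound on $\|\sum_J 1_J\|_{L^\infty}$ controls the number of layers.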

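Now, since $(I_T)_A \subset 3AI_T$ for each $T$, using (46) we have

$$\Big\| \sum_{J \in \mathcal{J}_A} 1_J \Big\|_{L^\infty} \le 16A^2 \Big\| \sum_{T} (M[1_{I_T}])^2 \Big\|_{L^\infty} \le 16A^2 L.$$

Thus, $\mathcal{J}_{A,1}, \dots, \mathcal{J}_{A,16A^2L}$ partition $\mathcal{J}_A$. Letting $\mathcal{J}_m = \{J \in \mathcal{J} : (J)_A \in \mathcal{J}_{A,m}\}$ it
follows that $\mathcal{J}_1, \dots, \mathcal{J}_{16A^2L}$ partition $\mathcal{J}$. Thanks to $(A, d)$ sparsity of $\mathcal{J}$ again, this
partition is consistent with the usual set inclusion ordering in $\mathcal{J}$, in the sense that
if $J \in \mathcal{J}_m$, $J' \in \mathcal{J}_n$ and $J \subsetneq J'$ then $m > n$.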

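For $J \in \mathcal{J}$ let $m$ be such that $J \in \mathcal{J}_m$, and define

$$\mathbf{P}_J := \{P \in \mathbf{P} : I_P = J\},$$

$$\mathbf{P}_{<J} := \{P \in \mathbf{P} : I_P \subsetneq J \text{ but } I_P \not\subset J' \text{ for all } J' \in \bigcup_{m' > m} \mathcal{J}_{m'}\}.$$

We obtain the following partition of $\mathbf{P}$:

(49) $\mathbf{P} = \bigcup_{m} \bigcup_{J \in \mathcal{J}_m} (\mathbf{P}_J \cup \mathbf{P}_{<J})$.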

By definition of $\phi_{P,j}$, it is clear that (48) will follow from the following estimates
(80) I 2 ] 2 <>pi’P.l(^)\\Li{v;) £, I + \og- (AL) + A-'>L 


(61) 


JeJ PePj: |/p>2'' 

JeJ PeP^j: |/p> 2 '' 




{ALf + L^A^-^ . 


11.6. Proof of (50). Recall that $\|b\|_{\ell^2(\mathbf{P})} = 1$. Recall that $\psi_{P,j}$ is a wave function
of order $M$, which is assumed sufficiently large compared to $\eta$. We first estimate
the error term

$$E(x) := \sum_{J \in \mathcal{J}:\ x \notin (J)_A} \ \sum_{P \in \mathbf{P}_J} |b_P \psi_{P,j}(x)|.$$


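Lemma 11.3. It holds that

$$\Big\| \sum_{J \in \mathcal{J}} \sum_{P \in \mathbf{P}_J:\ |I_P| \ge 2^k} b_P \psi_{P,j}(x) \Big\|_{V^r_k} \le \Big\| \sum_{m=1}^{n} \sum_{J \in \mathcal{J}_m} \sum_{P \in \mathbf{P}_J} b_P \psi_{P,j}(x) \Big\|_{V^r_n} + 2E(x).$$


Proof. Using the triangle inequality and the definition of $E(x)$, the left hand side
of the desired estimate is bounded above by

$$\Big\| \sum_{J \in \mathcal{J}:\ x \in (J)_A} \sum_{P \in \mathbf{P}_J:\ |I_P| \ge 2^k} b_P \psi_{P,j}(x) \Big\|_{V^r_k} + E(x)
= \Big\| \sum_{J \in \mathcal{J}:\ x \in (J)_A,\ |J| \ge 2^k} \sum_{P \in \mathbf{P}_J} b_P \psi_{P,j}(x) \Big\|_{V^r_k} + E(x).$$

Now, the intervals $(J)_A$ that contain $x$ are nested, with larger intervals belonging to
some $\mathcal{J}_{A,m}$ with smaller $m$; thus we could bound the last display by

$$\Big\| \sum_{m=1}^{n} \sum_{J \in \mathcal{J}_m:\ x \in (J)_A} \sum_{P \in \mathbf{P}_J} b_P \psi_{P,j}(x) \Big\|_{V^r_n} + E(x)
\le \Big\| \sum_{m=1}^{n} \sum_{J \in \mathcal{J}_m} \sum_{P \in \mathbf{P}_J} b_P \psi_{P,j}(x) \Big\|_{V^r_n} + 2E(x),$$

finishing the proof. □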


Lemma 11.4. It holds that 

\\E{x)\\l2 <, A-^L. 

Proof. We note that any $T \in \mathbf{T}$ contributes at most $O(1)$ tri-tiles to each $\mathbf{P}_J$ and
such a contribution would necessitate $J \subset I_T$. Thus, $|\mathbf{P}_J| \lesssim \|N_{\mathbf{T}}\|_{L^\infty} \le L$, so

iiS T 

JeJ PePj JeJ 

We also have 

||1r\A7p'0PjI|li ~M
which gives


JeJ-. xf{J)A PePj 

Choosing $M$ sufficiently large, depending on $\eta$, the desired bound for $\|E\|_2$
follows by an application of Cauchy-Schwarz. □

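Applying Lemma 7.3, we see that for any sequence $(\epsilon_m) \subset \{1, -1\}$ we have

$$\Big\| \sum_{m=1}^{16A^2L} \sum_{J \in \mathcal{J}_m} \sum_{P \in \mathbf{P}_J} \epsilon_m b_P \psi_{P,j} \Big\|_{L^2} \lesssim (1 + \log(L));$$

therefore, by Lemma 7.2 we have

$$\Big\| \sum_{m=1}^{n} \sum_{J \in \mathcal{J}_m} \sum_{P \in \mathbf{P}_J} b_P \psi_{P,j}(x) \Big\|_{L^2_x(V^r_n)} \lesssim [1 + \log(16A^2L)] \cdot [1 + \log(L)],$$

which, combined with Lemma 11.4 and Lemma 11.3, gives (50).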


11.7. Proof of (51). Here, we use two error terms 


leA^L 


^i(^) = Tj Yj Y 


m=l JeJm PeP<j 
xfj 

IQA^L 

E2{x) = Y, Y Y Y \^P'^Pdi.^)\ 

m=2 J^Jm m'<m PgP > 
rpd 7 <J^ 

where, if $J \in \mathcal{J}_m$ and $m' < m$ then we let $J^{m'}$ denote the unique element of $\mathcal{J}_{m'}$
such that $(J)_A \subset (J^{m'})_A$.

Lemma 11.5. It holds that 

n 

II 2 Y bpiljpj{x)\\vr < Ei{x) + E2ix) + W Y Y Y bp'4’p,jix)\\v;^ 

m=l JeJm PeP<j 


JeJ PeP<j 


+ ( S II S bp'^pA^Wvd 

TTl^jG^m. PgP ^j'. |/p|^2^ 


2 \V 2 


Remark: A simpler analogue of Lemma 11.5 was considered in [TJ Lemma 12.2].
Our Lemma 11.5 (and the following Lemma 11.7) in fact fills in a small gap in [TJ
Lemma 12.2], where an error term similar to $E_2$ was not treated.


Proof. By the triangle inequality 


Y Xj ^p'^pA^)\\yi ^ 


JeJ PgP<j 
|/p|;s 2 '= 


L 

JeJ PgP<j 
Op > 2 '= 


2 ^p^pj(^)llLr+ ^l(^) 


Let $J_1 \supset \dots \supset J_N$ be the (nested) intervals in $\mathcal{J}$ that contain $x$. Choose
$k_1, \dots, k_N$ so that $2^{k_\ell} = |J_\ell|$, which (together with $N$) are functions of $x$. Then,
the first term on the right of the last display could be rewritten as

- II E S hpi>pj{x)\v^ < II + E2{x) 


IsSfjgA PsP 


■^l 

|/p|> 2 '= 


PeP< 






where, for the inequality above, we use the fact that $J_{\ell+1} \subset J_\ell$ and so $(J_{\ell+1})_A \subsetneq
(J_\ell)_A$. Using a long jump/short jump decomposition of the variation-norm, the
first term on the right side of the inequality above is $\lesssim A_1 + A_2$, where

A = 11 ^ ^ bp'ipp,j{x)\\vr 

^ PeP^j^-. \Ip\^2>=", \Ip\^2'’t+i 


A = 2(2 II 2 2 bpijpj{x)fyr 


d /2 


n I 


PeP 


<Ji 


k„^l^k<k„ 

\Ip\^2'‘, |/p|> 2 '‘^+l 

It is clear that 

A - IE E bp'i/jpj{x)\\vr ^ II E E bp^/JpJ{x)\\vr + E2{x) 


i<n PeP< 




£<n PsP< 


A 


|/p> 2 '‘^+i 


E E E bp'4’p,j{x)\\v^ + E2 {x) 


m=l JsJm PeP<j 
xej 


< IE E E bptjjpj (x) II V- + El (x) + E 2 (x) , 

m=l jGjm PeP<j 


•Ao — 


KEll E bpijjpjix) + E E bpi^pj 


,x, 


n E 6 P<j„ 
I Ip I > 2 * 


2(211 S AAjAII 


t<n PeP<j^ 
|7p>2''<?+i 
1/2 




k^j^l^k<kn 


1/2 




n PeP<j„ 

|/p|> 2 '' 


fc„+is;fe<fen 


^ 2 II Z! AAi(a^)ll 


VI 


m,JeJ^ PsP<j 

|/p|;s 2 '= 


1/2 


□ 


Lemma 11.6. It holds that 

\\Ei{x)\\l2 <, LA-V 

Proof. Using Cauchy-Schwarz it suffices to show that for each $m$

II 2 Z IAAi(a^)|||L2 ^^'^^^"''l|A||£2(Ujej,„P<j) • 

JeJm- xifj PeP<,/ 

Using the Fefferman-Stein maximal inequality and the fact that the intervals in
$\mathcal{J}_m$ are disjoint, it suffices to prove that if $J \in \mathcal{J}_m$ and $x \notin J$ then

2 \bpi’p,j{Ej\ <r, ^11 A||£2(p^j)|J| ^'^^(Al[lj](a;))^. 

P 6 P<J
Now, if $T$ intersects $\mathbf{P}_{<J}$ then $J \subset I_T$; therefore using (46) we see that at most
$L$ trees in $\mathbf{T}$ contribute to a given $\mathbf{P}_{<J}$. Thus, using Cauchy-Schwarz it suffices to
show that, for each $T \in \mathbf{T}$,

(52) 2 S- 

PeTnP^, 


Choosing large enough, the {A,d) sparsity and (47) imply that for each P in 
the sum above 

supdist(i/,dJ) ^ 
yelp 

Recall that $M$ is the order of the wave packet $\psi_{P,j}$. Thus, for $x \notin J$, choosing $M$
large enough we obtain

\^pAx)P <„ 

DpI 

<, A-^ATrirV\-\M[ij](x))\ 

Mp 


Summing over $P \in \mathbf{P}_{<J} \cap T$ we obtain (52). □

Lemma 11.7. It holds that 

\\E^{x)\\l. <, 

Proof. By Cauchy-Schwarz, it suffices to show that for $1 \le m' < m$ we have

II Xj 2j l^pV'Pj(a;)|||L2 

JeJ^ PeP^j^, 

|/p|<|J| 

The above bound will follow, by Cauchy-Schwarz, from the following two estimates 


(53) 

II L L 

JeJ^ PeP^j^, 
|/p|<|J| 



1 

to 

03 

(54) 

II L L w 

|7p|<|J| 

■^/>pV'p,i(x)|||pi 

< 

P<j) 

m' 


To see (53) fix $x$ and choose the unique $J \in \mathbf{J}_m$ with $x \in J$. As in Lemma 11.6
it suffices to show that for each $T \in \mathbf{T}$


(55) 




\Ip\'Ai’P.i(r)\ S.I A 




Choosing large, it follows (as in Lemma 11.6) that the following holds for every 
P in the sums above: 

inf dist(i/,dJ) ^ • 

yelp 

Since \Ip\ < |J| and P s P^j^/, it follows that Ip n J = 0. Since x e J, using 
{A,d) sparseness and (47) we obtain 

dist(a;, Jp) ^ 2^9^|J|^/2 |j^|1/2 _ 
Due to the restriction of the sum to tiles in a single tree, each dyadic interval is
the time interval of at most $O(1)$ tri-tiles, and so for each $k > 0$






PeP^j^,nT: |/p|=2-'=|J| 


and summing over k gives (55). 


To see (54) simply use the fact that the intervals in $\mathbf{J}_m$ are pairwise disjoint to

|-fp| 


estimate the left side by 

^ I S 


-1/217,2 


\bpijpj{x)\\\Li < || 6 ||^ 2 (u 


JeJ, 


,P<j) 


□ 


Applying Lemma 7.2 and Lemma 7.3 as in the proof of (50) we have

n 

II Zi Zi Z bp^/JpJ{x)\\L2^vr) ^ l + \og\AL) 


m=l jGjm PeP< 


Thus, using Lemma 11.5, Lemma 11.6, and Lemma 11.7, to finish the proof of
(51) it suffices to establish the following inequality (for each $m$ and $J \in \mathbf{J}_m$):


(56) 


^ 6pV^Pj(a:)||£,2(yr) <, L^\\bp\\e2{p^j). 

PgP<j: |/p|>2'= 


Let Tj be the collection of trees in T which contribute to P<,/. As above, we have 
|Tj| ^ L. For each T e Tj let = (^{^{Pt)])- Then, for each P e T e Tj we have 

^Pj ('Ct — 10 |e| ■ IcupJ,+ 10|e| • IwpJ). 

Furthermore, from condition (|^ in the definition of strong j-disjointness and the 
fact that J c Jp for each T e Tj, we have that 


dist(^T,iVpJ ^ \(^Pj\/4: 


for each $P \in \mathbf{P}_{<J}$ and each $T \in \mathbf{T}_J$. Therefore, if we let

$$R_k = \bigcup_{T \in \mathbf{T}_J} \left(\xi_T - 10|e|2^{-k},\ \xi_T + 10|e|2^{-k}\right)$$

and Llfc be the Fourier projection operator nfc[/] = P~^[lp^f] then, for each 
k = —i mod 4000 we have 

(57) $\sum_{P \in \mathbf{P}_{<J}:\ |I_P| \ge 2^k} b_P \psi_{P,j} = \Pi_k\Big[\sum_{P \in \mathbf{P}_{<J}} b_P \psi_{P,j}\Big].$


Thus, by Theorem 8.1 and Lemma 7.3 we have (56). 


12. Concluding the proof of Theorem 6.4
Let P be a finite subset of Pj^. Our aim is to prove that the trilinear form
satisfies restricted weak-type estimates with exponents a arbitrarily close to any
vertex of A defined by (14), with implicit constants uniform over P. We'll consider
neighborhoods of Ai{^Y, 1); the other vertices can be treated similarly.

By (dyadic) dilation symmetry we can assume $|F_1| \in [1/2, 1)$. Fix $s > 1$ close
to 1 to be chosen later, and choose

$$B = \bigcup_{j=1}^{3} \{M^s[1_{F_j}] > C|F_j|^{1/s}\},$$

with C sufficiently large so that \B\ ^ 1. Let |/i| ^ 1 fi-b and I/ 2 I ^ 1 f2 and 
$|f_3| \le 1_{F_3}$. Decompose $\mathbf{P} = \bigcup_{k \ge 0} \mathbf{P}_k$ where

$$\mathbf{P}_k = \{P \in \mathbf{P} : 2^k \le 1 + \mathrm{dist}(I_P, B^c)/|I_P| < 2^{k+1}\}.$$


For P e Pfc we have 


ip^i 


I 


sup TTT \fj{x)Yxi{xf dx 


l/s 


< 2^/" sup 

ipci 


|2^J| 


\fjix)\"X 2 '‘i{xy dx 


l/s 


< 2^/^ inf M^[IfA{x) 

xe2^Ip 


Therefore, by Proposition 9.1, for $j = 2, 3$ we have

(59) $S_j := \mathrm{size}_j(\mathbf{P}_k, f_j) \lesssim 2^{k/s}|F_j|^{1/s}.$

Here (and below) the implicit constants may depend on $r$, $s$, and $\beta_i$ (defined
below). Now, when $j = 1$ we will obtain the improved estimate

(60) $S_1 := \mathrm{size}_1(\mathbf{P}_k, 1_{B^c} f_1) \le C\, 2^{-(M-2)k}$

by exploiting the fact that the interval $I$ in the last sup has to be contained inside
another $I_{P'}$ for some $P' \in \mathbf{P}_k$.

Now, applying Proposition 10.1 repeatedly, we obtain a decomposition of $\mathbf{P}_k$
into collections of trees $(\mathbf{T}_n)_{n \in \mathbb{Z}}$ with

(61) $\sum_{T \in \mathbf{T}_n} |I_T| \lesssim 2^n,$

such that for any $T \in \mathbf{T}_n$ we have
(62) size,(T,/i) < 

Now, for any tree T we have 


(63) 


L l^p|"‘’Tll(A'('p,.>l « 3|/Tinsize.(r,/.). 


PgT 


2=1 


2 = 1 


To see (63), by further decomposing $T$ if needed we may assume that $T$ is
$i$-overlapping for some $i \in \{1, 2, 3\}$. Then estimating

< sizei(T,/i) 

and applying Cauchy-Schwarz to estimate the remaining bilinear sum by 


< 


I Yl sizej(T,/j) 

je{l,2,3}\{2} 
one obtains (63).


Applying (61), (62), (63), we obtain


2=1 


|Afc(/l,/2, /s)! ^ \Ip\ PI I (/i) 0 P,i) I- 

PgPj. i=l 

For any $\beta_1, \beta_2, \beta_3 \in [0, 1]$, we obtain

|Afc(/i,/2,/3)| < ^2^3 2 2" min (l, 2- 




riTFsA 

n 2=1 

The above estimate is a two-sided geometric series if we choose the $\beta_j$'s such that
$\beta_1 + \beta_2 + \beta_3 > 2s$ (which is possible for $s$ close to 1). Letting $\gamma_j := 2s\beta_j/(\beta_1 + \beta_2 + \beta_3)$
we obtain


2=1 
k I 


< 2s(2-72-73-s(M-2)(1-7i)) ( 

2=1 


1 /s 


(using (59), (60)). 


Again assuming that $\beta_1 + \beta_2 + \beta_3 > 2s$ we are guaranteed $\gamma_1 < 1$ and so, choosing
$M$ large enough depending on $\beta$ we may sum in $k$ to conclude

ia(/,,/2,/3)i s (nipi‘^^ 


2 = 1 


l/s 


Since $|F_1| \sim 1$, we can ignore its contribution in the above estimate. Now, by
sending $(s, \beta_1, \beta_2, \beta_3)$ to $(1, 1, 1, 0)$ inside the region $\{\beta_1 + \beta_2 + \beta_3 > 2s\} \cap \{0 <
\beta_2, \beta_3 \le 1 < s\}$, we obtain the desired claim.
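
As a quick check of the normalization used in this section, the following short computation verifies the two properties of the exponents $\gamma_j$ that the argument relies on. It is a sketch that uses only the definition $\gamma_j = 2s\beta_j/(\beta_1+\beta_2+\beta_3)$ and the standing assumptions $\beta_j \in [0,1]$ and $\beta_1+\beta_2+\beta_3 > 2s$; the auxiliary symbol $\theta$ is introduced here for illustration and does not appear in the original text.

```latex
% theta is an illustrative abbreviation, not a symbol from the paper.
\[
\theta := \frac{2s}{\beta_1+\beta_2+\beta_3} \in (0,1),
\qquad
\gamma_j = \theta\,\beta_j ,
\]
\[
\gamma_1+\gamma_2+\gamma_3 = \theta\,(\beta_1+\beta_2+\beta_3) = 2s,
\qquad
0 \le \gamma_j \le \theta < 1 \quad (j = 1,2,3).
\]
```

In particular $\gamma_1 < 1$, so the factor $1 - \gamma_1$ is strictly positive, which is what allows the decay in $k$ to be obtained by taking $M$ large.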

13. Proof of Theorem 6.5


The proof of Theorem 6.5 is entirely similar to the proof of Theorem 6.4; essentially,
the main difference is that variation-norm estimates such as the continuous
Lépingle inequality (see Lemma 7.1) are replaced by the classical Littlewood-Paley
square function estimate. We briefly discuss the cosmetic changes; the details are
left to the reader. We may define = 'tppj for j = 1,2, and 0p^3 = 'iljp^ 3 dn if 
|/p| = 2"' and 0 otherwise. 

Now, the sizes are defined exactly as before, and to get the size estimates for
$\mathrm{size}_3(\mathbf{P}, f)$ (as in Proposition 9.1) we use the same proof; the only difference is that near
the end we appeal to the classical estimates for the Littlewood-Paley square
functions associated with scales of the underlying tree, instead of the continuous
Lépingle inequality.

Now, to get the size increment estimate (as in Proposition 10.1) we use the same
reduction to a Bessel inequality as in Theorem 11.1. To prove this Bessel estimate
for the new $\phi_{P,3}$, we follow the same sequence of reductions and the proof reduces
to proving Proposition 11.4 with the new modified wave packets. We perform the
same partition of $\mathbf{P}$ as in (49), and it suffices to show the following two analogues
of (50) and (51). Below we let $S_k$ denote the $\ell^2$ sum of a sequence indexed by $k$,
and $(b_P)$ is a sequence on $\mathbf{P}$ with normalized $\ell^2(\mathbf{P})$ norm.

(64) WYj S bpi/jp^3{x)\\L2^s^-) <r, l + \og^{AL) + A-^L 

JeJ PeFj: |/p|=2fe 

(65) 12 2 '>pVp,3(i)|lLi(s.) Sp, (ALT + FA^-i . 

JeJ PeF^j: |/p|= 2 '' 

The proofs of these two estimates are similar. We'll use the same error terms
$E(x)$, $E_1(x)$, and $E_2(x)$, and using analogues of Lemma 11.3 and Lemma 11.5 the
proofs of (64) and (65) reduce to proving

( 66 ) 11(2 (S bp'iljp^3)y^^\\2 < [1 + log(16/l2L)][l + log(L)] 

m=l JeJm PeFj 

(67) 1 ( 2(2 2 ^ [l + \og{lQA^L)][l + \og{L)] 

m=l jGjm PeP<j 

( 68 ) ll(2( 2 ^P'*/’P,3)^)^'^^||2 L^\\bp\\£^{F^j) 

k PeP<j: |/p|=2'' 

We note that the square norm is bounded above by the 2-variation norm. Thus,
using Lemma 7.2 the estimates (66) and (67) follow from Lemma 7.3. Similarly,
using Proposition 8.2 and the Fourier projection representation (57), the estimate
(68) follows from Lemma 7.3.
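
The comparison between the square norm and the 2-variation norm can be checked numerically on finite sequences. The sketch below is illustrative only: the function names `variation_norm` and `square_sum_of_increments` are hypothetical, the variation norm is taken as the seminorm without a supremum term (which may differ from the paper's normalization), and the "square norm" is interpreted as the $\ell^2$ sum of consecutive increments of the sequence of truncated sums.

```python
from typing import Sequence


def variation_norm(a: Sequence[float], r: float) -> float:
    """r-variation seminorm of a finite sequence (assumed convention, no sup term).

    It is the supremum, over increasing chains of indices n_0 < n_1 < ... < n_m,
    of (sum_j |a[n_{j+1}] - a[n_j]|^r)^(1/r), computed by dynamic programming.
    """
    n = len(a)
    # best[j] = largest sum of r-th powers of jumps over chains ending at index j
    best = [0.0] * n
    for j in range(1, n):
        best[j] = max(best[i] + abs(a[j] - a[i]) ** r for i in range(j))
    return max(best, default=0.0) ** (1.0 / r)


def square_sum_of_increments(a: Sequence[float]) -> float:
    """l^2 sum of the consecutive increments a[k+1] - a[k]."""
    return sum((a[k + 1] - a[k]) ** 2 for k in range(len(a) - 1)) ** 0.5


if __name__ == "__main__":
    seq = [0.0, 1.0, 0.0, 2.0]
    # The chain through every index is one admissible chain, so the l^2 sum of
    # increments can never exceed the 2-variation seminorm.
    print(square_sum_of_increments(seq), variation_norm(seq, 2.0))
```

The dynamic program runs in quadratic time in the length of the sequence, which is sufficient for small experiments; the inequality itself holds because the chain of all consecutive indices is one of the competitors in the supremum.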


Acknowledgement. This work was initiated while the authors were visiting the
University of California, Los Angeles in Winter 2012, and the visit was supported
in part by the AMS Math Research Communities program. The authors would like
to thank the MRC and Christoph Thiele for their generous support, hospitality,
and useful conversations.


References 

[1] Jeremy Avigad and Jason Rute. Oscillation and the mean ergodic theorem for uniformly
convex Banach spaces. Ergodic Theory and Dynamical Systems, to appear.

[2] J. Bourgain. Double recurrence and almost sure convergence. J. Reine Angew. Math.,
404:140-161, 1990.

[3] Ciprian Demeter. Pointwise convergence of the ergodic bilinear Hilbert transform. Illinois
J. Math., 51(4):1123-1158, 2007.

[4] Ciprian Demeter. On some maximal multipliers in $L^p$. Rev. Mat. Ibero., 26(3):947-964,
2010.

[5] Ciprian Demeter. Improved range in the return times theorem. Canad. Math. Bull.,
55(4):708-722, 2012.

[6] Ciprian Demeter, Michael T. Lacey, Terence Tao, and Christoph Thiele. Breaking the
duality in the return times theorem. Duke Math. J., 143(2):281-355, 2008.

[7] Ciprian Demeter, Terence Tao, and Christoph Thiele. Maximal multilinear operators. Trans.
Amer. Math. Soc., 360(9):4989-5042, 2008.

[8] Yen Do, Camil Muscalu, and Christoph Thiele. Variational estimates for paraproducts.
Revista Mat. Ibero., 28(3):857-878, 2012.

[9] Yen Do, Richard Oberlin, and Eyvindur Palsson. Variational bounds for a dyadic model of
the bilinear Hilbert transform. Illinois J. Math., 53(2):105-119, 2013.

[10] Roger L. Jones, Andreas Seeger, and James Wright. Strong variational and jump inequalities
in harmonic analysis. Trans. Amer. Math. Soc., 360(12):6711-6742, 2008.

























[11] Vjeko Kovač. Quantitative norm convergence of double ergodic averages associated with
two commuting group actions. Ergodic Theory Dyn. Syst., to appear, 2014.

[12] Michael Lacey and Christoph Thiele. On Calderón's conjecture. Ann. of Math. (2),
149(2):475-496, 1999.

[13] Michael Lacey and Christoph Thiele. A proof of boundedness of the Carleson operator.
Math. Res. Lett., 7(4):361-370, 2000.

[14] Michael T. Lacey. The bilinear maximal functions map into $L^p$ for $2/3 < p \le 1$. Ann. of
Math. (2), 151(1):35-57, 2000.

[15] Camil Muscalu, Terence Tao, and Christoph Thiele. Multi-linear operators given by singular
multipliers. J. Amer. Math. Soc., 15(2):469-496 (electronic), 2002.

[16] Camil Muscalu, Terence Tao, and Christoph Thiele. $L^p$ estimates for the biest. II. The
Fourier case. Math. Ann., 329(3):427-461, 2004.

[17] Fedor Nazarov, Richard Oberlin, and Christoph Thiele. A Calderón-Zygmund decomposition
for multiple frequencies and an application to an extension of a lemma of Bourgain.
Math. Res. Lett., 17(2-3):529-545, 2010.

[18] Christoph Thiele. The maximal quartile operator. Rev. Mat. Iberoamericana, 17(1):107-135,
2001.

Department of Mathematics, The University of Virginia, Charlottesville, VA 22904-4137, USA

E-mail address: yendo@virginia.edu

Department of Mathematics, Florida State University, Tallahassee, FL 32306-4510, USA

E-mail address: roberlin@math.fsu.edu

Department of Mathematics and Statistics, Williams College, Williamstown, MA 01267, USA

E-mail address: eap2@williams.edu



